Egocentric Activity Prediction via Event Modulated Attention

Yang Shen, Bingbing Ni, Zefan Li, Ning Zhuang; Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 197-212


Predicting future activities from an egocentric viewpoint is of particular interest in assisted living. However, most state-of-the-art egocentric activity understanding techniques are not capable of predictive tasks, as their synchronous processing architectures perform poorly at either modeling event dependency or pruning temporally redundant features. This work explicitly addresses these issues by proposing an asynchronous, gaze-event-driven attentive activity prediction network. The network is built on a gaze-event extraction module, inspired by the observation that gaze moving into or out of a certain object most probably indicates the start or end of a certain activity. The extracted gaze events are input to: 1) an asynchronous module that reasons about the temporal dependency between events and 2) a synchronous module that softly attends to informative temporal durations for more compact and discriminative feature extraction. Both modules are integrated for collaborative prediction. Extensive experiments on egocentric activity prediction and recognition demonstrate the effectiveness of the proposed method.
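
The gaze-event idea in the abstract (gaze entering or leaving an object marks a likely activity boundary) can be illustrated with a minimal sketch. This is not the authors' extraction module. It assumes a per-frame label of the object under the gaze point is already available (`None` when gaze rests on no object), and the names `extract_gaze_events` and `GazeEvent` are hypothetical, introduced only for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical event record: (frame_index, event_type, object_label),
# where event_type is "in" (gaze enters object) or "out" (gaze leaves it).
GazeEvent = Tuple[int, str, str]


def extract_gaze_events(gazed_objects: List[Optional[str]]) -> List[GazeEvent]:
    """Turn per-frame gazed-object labels into sparse in/out events.

    Assumption: gazed_objects[t] is the label of the object under the gaze
    point at frame t, or None when gaze is on no object.
    """
    events: List[GazeEvent] = []
    previous: Optional[str] = None
    for frame, current in enumerate(gazed_objects):
        if current != previous:
            if previous is not None:
                # Gaze left the previously fixated object at this frame.
                events.append((frame, "out", previous))
            if current is not None:
                # Gaze entered a new object at this frame.
                events.append((frame, "in", current))
        previous = current
    return events


labels = [None, "cup", "cup", None, "kettle", "kettle", "cup"]
events = extract_gaze_events(labels)
for event in events:
    print(event)
```

The resulting events are irregularly spaced in time, which is why a sequence of them suits an asynchronous module, while the frames between an "in" and its matching "out" delimit a duration that a synchronous attention module could weight.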

Related Material

author = {Shen, Yang and Ni, Bingbing and Li, Zefan and Zhuang, Ning},
title = {Egocentric Activity Prediction via Event Modulated Attention},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
month = {September},
year = {2018}