Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals

Katsuyuki Nakamura, Serena Yeung, Alexandre Alahi, Li Fei-Fei; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1868-1877

Abstract


Physiological signals such as heart rate can provide valuable information about an individual's state and activity. However, existing work in computer vision has not yet explored leveraging these signals to enhance egocentric video understanding. In this work, we propose a model for reasoning on multimodal data to jointly predict activities and energy expenditures. We use heart rate signals as privileged self-supervision to derive energy expenditure at training time. A multitask objective is used to jointly optimize the two tasks. Additionally, we introduce a dataset that contains 31 hours of egocentric video augmented with heart rate and acceleration signals. This work can enable new applications such as a visual calorie counter.
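
To illustrate the multitask setup described in the abstract, the following is a minimal sketch of how such a joint objective might be formed: a shared encoder over fused multimodal features feeds an activity-classification head and an energy-expenditure regression head, with heart-rate-derived energy expenditure serving as the regression target during training. All module names, dimensions, and the loss weight are hypothetical; the paper's actual architecture and loss formulation may differ.

import torch
import torch.nn as nn

class MultitaskModel(nn.Module):
    """Shared encoder with two heads: activity classification and
    energy-expenditure regression (hypothetical structure)."""
    def __init__(self, in_dim, hidden_dim, num_activities):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
        )
        self.activity_head = nn.Linear(hidden_dim, num_activities)
        self.energy_head = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        h = self.encoder(x)
        return self.activity_head(h), self.energy_head(h).squeeze(-1)

# Joint objective: cross-entropy on activity labels plus a weighted
# regression loss on energy expenditure (the weight `lam` is a guess).
model = MultitaskModel(in_dim=128, hidden_dim=64, num_activities=10)
features = torch.randn(32, 128)           # fused multimodal features
activity_labels = torch.randint(0, 10, (32,))
energy_targets = torch.rand(32) * 10.0    # e.g. values derived from heart rate

logits, energy_pred = model(features)
lam = 1.0
loss = nn.functional.cross_entropy(logits, activity_labels) \
     + lam * nn.functional.mse_loss(energy_pred, energy_targets)
loss.backward()

Because the two heads share an encoder, gradients from both losses shape the shared representation, which is the usual motivation for jointly optimizing related tasks.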

Related Material


@InProceedings{Nakamura_2017_CVPR,
author = {Nakamura, Katsuyuki and Yeung, Serena and Alahi, Alexandre and Fei-Fei, Li},
title = {Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {July},
year = {2017}
}