Forecasting Hands and Objects in Future Frames

Fan, Chenyou; Lee, Jangwon; Ryoo, Michael S.

Chenyou Fan, Jangwon Lee, Michael S. Ryoo; Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018

Abstract

This paper presents an approach to forecasting the future presence and location of human hands and objects. Given an image frame, the goal is to predict which objects will appear in a future frame (e.g., 5 seconds later) and where they will be located, even when they are not visible in the current frame. The key idea is that (1) an intermediate representation of a convolutional object recognition model abstracts the scene information in its frame, and that (2) we can predict (i.e., regress) the representations corresponding to future frames from that of the current frame. We present a new two-stream fully convolutional neural network (CNN) architecture designed for forecasting future objects given a video. The experiments confirm that our approach allows reliable estimation of future objects in videos, achieving much higher accuracy than the state-of-the-art future object presence forecast method on public datasets.
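The key idea in the abstract, regressing the future frame's intermediate representation from the current frame's, can be illustrated with a toy sketch. This is not the paper's two-stream fully convolutional network: the encoder below is a simple average-pooling stand-in for a CNN feature map, the regressor is a plain least-squares linear map, and the data is synthetic (a blob that moves one cell to the right). All names (`encode`, `make_pair`, `CELL`, `SIZE`, `GRID`) are hypothetical and chosen only for illustration.

```python
import numpy as np

CELL = 4            # side of each pooling cell in pixels (assumed for this toy)
SIZE = 16           # side of each square frame in pixels (assumed for this toy)
GRID = SIZE // CELL # number of cells per side in the feature map


def encode(frame):
    """Stand-in for an intermediate CNN representation.

    Average-pools the frame into a GRID x GRID map and flattens it.
    """
    h, w = frame.shape
    pooled = frame.reshape(h // CELL, CELL, w // CELL, CELL).mean(axis=(1, 3))
    return pooled.ravel()


def make_pair(rng):
    """Synthetic (current, future) frames: a blob that moves one cell right."""
    row = rng.integers(0, GRID)
    col = rng.integers(0, GRID - 1)  # leave room for the shift to the right
    now = np.zeros((SIZE, SIZE))
    future = np.zeros((SIZE, SIZE))
    now[row * CELL:(row + 1) * CELL, col * CELL:(col + 1) * CELL] = 1.0
    future[row * CELL:(row + 1) * CELL, (col + 1) * CELL:(col + 2) * CELL] = 1.0
    return now, future


rng = np.random.default_rng(0)
pairs = [make_pair(rng) for _ in range(500)]

# Representations of the current frames (inputs) and future frames (targets).
X = np.stack([encode(now) for now, _ in pairs])
Y = np.stack([encode(future) for _, future in pairs])

# Regress future representations from current ones: X @ W ~= Y.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Forecast the representation of an unseen future frame and read off
# the most likely object location from the predicted feature map.
test_now, test_future = make_pair(rng)
predicted = encode(test_now) @ W
target = encode(test_future)
predicted_cell = np.unravel_index(np.argmax(predicted), (GRID, GRID))
true_cell = np.unravel_index(np.argmax(target), (GRID, GRID))
```

In this sketch the forecast is made entirely in representation space; the future frame's pixels are never predicted. A real system would presumably replace `encode` with features from an object recognition model and the linear map with a learned network.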

Related Material

[bibtex]

@InProceedings{Fan_2018_ECCV_Workshops,
author = {Fan, Chenyou and Lee, Jangwon and Ryoo, Michael S.},
title = {Forecasting Hands and Objects in Future Frames},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV) Workshops},
month = {September},
year = {2018}
}