Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks

Toby Perrett, Dima Damen; Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops, 2017, pp. 1354-1362


In this paper, we investigate whether it is possible to leverage information from multiple datasets when performing frame-based action recognition, which is an essential component of real-time activity monitoring systems. In particular, we investigate whether the training of an LSTM can benefit from pre-training or co-training on multiple datasets of related tasks when it uses non-transferred visual CNN features. A number of label mappings and multi-dataset training techniques are proposed and tested on three challenging kitchen activity datasets: Breakfast, 50 Salads and MPII Cooking 2. We show that transferring, by pre-training on similar datasets using label concatenation, delivers higher frame-based classification accuracy and faster training convergence than random initialisation.
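The abstract does not spell out how label concatenation is built, so the following is only a minimal sketch of one plausible reading: the label sets of the datasets are placed side by side in a single joint output space, with no merging of labels that share a name. The function name `concatenate_label_spaces`, the dataset names and the label lists are all hypothetical and are not taken from the paper or from the actual datasets.

```python
from typing import Dict, List, Tuple


def concatenate_label_spaces(
    label_sets: Dict[str, List[str]],
) -> Dict[Tuple[str, str], int]:
    """Map every (dataset, label) pair to an index in one joint label space.

    Assumption: labels are not merged across datasets, so identical names
    coming from different datasets still receive distinct indices.
    """
    joint_index: Dict[Tuple[str, str], int] = {}
    for dataset_name, labels in label_sets.items():
        for label in labels:
            # The next free index is the current size of the joint space.
            joint_index[(dataset_name, label)] = len(joint_index)
    return joint_index


# Hypothetical, heavily truncated label lists, for illustration only.
label_sets = {
    "dataset_a": ["pour", "stir", "cut"],
    "dataset_b": ["cut", "peel"],
}
joint_index = concatenate_label_spaces(label_sets)

# Under this reading, a classifier trained on both datasets would need
# one output unit per entry of the joint label space.
num_outputs = len(joint_index)
```

Under this assumption, a frame from `dataset_b` labelled `cut` is trained against a different output unit than a `dataset_a` frame labelled `cut`, which avoids having to decide by hand whether the two annotations mean the same thing.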

Related Material

author = {Perrett, Toby and Damen, Dima},
title = {Recurrent Assistance: Cross-Dataset Training of LSTMs on Kitchen Tasks},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV) Workshops},
month = {Oct},
year = {2017}