Learning End-to-End Action Interaction by Paired-Embedding Data Augmentation

Ziyang Song, Zejian Yuan, Chong Zhang, Wanchao Chi, Yonggen Ling, Shenghao Zhang; Proceedings of the Asian Conference on Computer Vision (ACCV), 2020


In recognition-based action interaction, robots' responses to human actions are often pre-designed according to recognized categories and thus stiff.In this paper, we specify a new Interactive Action Translation (IAT) task which aims to learn end-to-end action interaction from unlabeled interactive pairs, removing explicit action recognition.To enable learning on small-scale data, we propose a Paired-Embedding (PE) method for effective and reliable data augmentation.Specifically, our method first utilizes paired relationships to cluster individual actions in an embedding space.Then two actions originally paired can be replaced with other actions in their respective neighborhood, assembling into new pairs.An Act2Act network based on conditional GAN follows to learn from augmented data.Besides, IAT-test and IAT-train scores are specifically proposed for evaluating methods on our task.Experimental results on two datasets show impressive effects and broad application prospects of our method.

Related Material

[pdf] [arXiv]
@InProceedings{Song_2020_ACCV, author = {Song, Ziyang and Yuan, Zejian and Zhang, Chong and Chi, Wanchao and Ling, Yonggen and Zhang, Shenghao}, title = {Learning End-to-End Action Interaction by Paired-Embedding Data Augmentation}, booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)}, month = {November}, year = {2020} }