Thin-Slicing for Pose: Learning to Understand Pose Without Explicit Pose Estimation

Suha Kwak, Minsu Cho, Ivan Laptev; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 4938-4947

Abstract


We address the problem of learning a pose-aware, compact embedding that maps images with similar human poses to nearby points in the embedding space. The embedding function is built on a deep convolutional network and trained with triplet-based rank constraints on real image data. This architecture allows us to learn a robust representation that captures differences in human poses by effectively factoring out variations in clothing, background, and imaging conditions in the wild. For a variety of pose-related tasks, the proposed pose embedding provides a cost-efficient and natural alternative to explicit pose estimation, circumventing the challenges of localizing body joints. We demonstrate the efficacy of the embedding on pose-based image retrieval and action recognition problems.
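To make the idea of a triplet-based rank constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes embeddings are already computed as plain lists of floats, uses squared Euclidean distance, and applies a hinge with a margin of 0.2; the distance, the margin value, and the function names `squared_distance`, `triplet_rank_loss`, and `retrieve` are all illustrative assumptions, since the abstract does not specify them.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_rank_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss for one triplet (assumed form, margin is hypothetical).

    The loss is zero when the anchor is closer to the positive (similar
    pose) than to the negative (dissimilar pose) by at least `margin`.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def retrieve(query, gallery, k=3):
    """Indices of the k gallery embeddings nearest to the query."""
    order = sorted(range(len(gallery)),
                   key=lambda i: squared_distance(query, gallery[i]))
    return order[:k]


# Toy 2-D embeddings standing in for the output of a trained network.
anchor, positive, negative = [0.0, 0.0], [0.1, 0.0], [1.0, 0.0]
satisfied = triplet_rank_loss(anchor, positive, negative)  # constraint holds
violated = triplet_rank_loss(anchor, negative, positive)   # roles swapped
nearest = retrieve(anchor, [[1.0, 0.0], [0.1, 0.0], [0.5, 0.5]], k=2)
```

Under these assumptions, a correctly ordered triplet contributes no loss, while a violated one is penalized in proportion to how badly the ranking is broken. Pose-based retrieval then reduces to a nearest-neighbor search in the embedding space, with no body-joint localization step.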

Related Material


[bibtex]
@InProceedings{Kwak_2016_CVPR,
author = {Kwak, Suha and Cho, Minsu and Laptev, Ivan},
title = {Thin-Slicing for Pose: Learning to Understand Pose Without Explicit Pose Estimation},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2016},
pages = {4938-4947}
}