Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation
Sunghun Joung, Seungryong Kim, Hanjae Kim, Minsu Kim, Ig-Jae Kim, Junghyun Cho, Kwanghoon Sohn; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 14163-14172
Abstract
Existing techniques to encode spatial invariance within deep convolutional neural networks only model 2D transformation fields. This does not account for the fact that objects in a 2D image are projections of 3D ones, and thus these techniques have a limited ability to handle severe object viewpoint changes. To overcome this limitation, we introduce a learnable module, cylindrical convolutional networks (CCNs), that exploits a cylindrical representation of a convolutional kernel defined in 3D space. CCNs extract view-specific features through view-specific convolutional kernels to predict object category scores at each viewpoint. With these view-specific features, we simultaneously determine the object category and viewpoint using the proposed sinusoidal soft-argmax module. Our experiments demonstrate the effectiveness of cylindrical convolutional networks on joint object detection and viewpoint estimation.
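The sinusoidal soft-argmax mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the viewpoint is discretized into N azimuth bins, and the function name and exact formulation are ours. The idea is that a plain soft-argmax over angles breaks at the 0/2&pi; wrap-around, so the expectation is taken over the sine and cosine of each bin center and the angle is recovered with atan2, keeping the whole operation differentiable.

```python
import numpy as np

def sinusoidal_soft_argmax(scores):
    """Differentiable viewpoint estimate from per-bin scores.

    Hypothetical sketch: softmax weights over N azimuth bins are used to
    take an expectation over sin/cos of the bin centers, and the expected
    viewpoint is recovered with arctan2, avoiding the 0/2*pi discontinuity
    that a plain soft-argmax over raw angles would suffer from.
    """
    n = scores.shape[-1]
    angles = 2.0 * np.pi * np.arange(n) / n            # bin centers in [0, 2*pi)
    p = np.exp(scores - scores.max(-1, keepdims=True)) # numerically stable softmax
    p /= p.sum(-1, keepdims=True)
    s = (p * np.sin(angles)).sum(-1)                   # expected sine
    c = (p * np.cos(angles)).sum(-1)                   # expected cosine
    return np.arctan2(s, c) % (2.0 * np.pi)            # expected viewpoint angle

# Usage: a sharp peak at bin 3 of 12 recovers roughly 2*pi*3/12 = pi/2.
scores = np.full(12, -10.0)
scores[3] = 10.0
theta = sinusoidal_soft_argmax(scores)
```

In the paper's setting, such a module would consume the per-viewpoint category scores produced by the view-specific cylindrical kernels, yielding a continuous viewpoint estimate alongside the category decision.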
Related Material
[pdf]
[arXiv]
[video]
[bibtex]
@InProceedings{Joung_2020_CVPR,
author = {Joung, Sunghun and Kim, Seungryong and Kim, Hanjae and Kim, Minsu and Kim, Ig-Jae and Cho, Junghyun and Sohn, Kwanghoon},
title = {Cylindrical Convolutional Networks for Joint Object Detection and Viewpoint Estimation},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}
}