PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds

Aseem Behl, Despoina Paschalidou, Simon Donne, Andreas Geiger; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 7962-7971

Abstract


Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation-equivariant representation to circumvent this problem. Training our deep network requires a large dataset, so we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.
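
The claim about the global representation of rigid body motion can be illustrated with a small 2D sketch. It is only one reading of the abstract, not the paper's implementation: the function names, the 2D simplification, and the example numbers are all assumptions made for illustration. An object that rotates by R about its own center c and then shifts by d maps a point p to R(p - c) + c + d, which in world coordinates reads R p + t with t = c - R c + d. The translation t therefore depends on where the object sits in the scene, even when the local motion is identical.

```python
import math

def rotate(theta, p):
    """Rotate the 2D point p about the world origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def global_translation(theta, center, local_shift):
    """Translation t in the world-frame form p' = R p + t for an object that
    rotates by theta about its own center and then moves by local_shift.
    (Hypothetical helper, introduced only for this illustration.)"""
    rc = rotate(theta, center)
    return (center[0] - rc[0] + local_shift[0],
            center[1] - rc[1] + local_shift[1])

theta = math.pi / 2   # both objects turn by the same 90 degrees
shift = (1.0, 0.0)    # both objects move by the same local shift

# Same local motion, two different object locations in the scene.
t_near = global_translation(theta, (0.0, 0.0), shift)
t_far = global_translation(theta, (10.0, 0.0), shift)
print(t_near, t_far)
```

Up to floating-point rounding, `t_near` is (1, 0) while `t_far` is (11, -10), although both objects undergo the same local motion. A convolutional network sees the same local pattern in both cases, so it has no basis for regressing the location-dependent world-frame translation, whereas a motion expressed relative to the object's own position is the same for both.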

Related Material


[bibtex]
@InProceedings{Behl_2019_CVPR,
author = {Behl, Aseem and Paschalidou, Despoina and Donne, Simon and Geiger, Andreas},
title = {PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2019}
}