NVAutoNet: Fast and Accurate 360° 3D Visual Perception for Self Driving

Pham, Trung; Maghoumi, Mehran; Jiang, Wanli; Jujjavarapu, Bala Siva Sashank; Sajjadi, Mehdi; Liu, Xin; Lin, Hsuan-Chu; Chen, Bor-Jeng; Truong, Giang; Fang, Chao; Kwon, Junghyun; Park, Minwoo

Trung Pham, Mehran Maghoumi, Wanli Jiang, Bala Siva Sashank Jujjavarapu, Mehdi Sajjadi, Xin Liu, Hsuan-Chu Lin, Bor-Jeng Chen, Giang Truong, Chao Fang, Junghyun Kwon, Minwoo Park; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 7376-7385

Abstract

Robust, real-time 3D perception is fundamental for autonomous vehicles. Most existing 3D perception methods prioritize detection accuracy and often overlook computational efficiency, ease of deployment on onboard chips, resilience to sensor mounting deviations, and adaptability to different vehicle types. To address these challenges, we present NVAutoNet, a specialized Bird's-Eye-View (BEV) perception network tailored for automated vehicles. NVAutoNet takes synchronized camera images as input and predicts 3D signals such as obstacles, freespaces, and parking spaces. The core of its architecture (the image and BEV backbones) relies on efficient convolutional networks optimized for high performance with TensorRT. The image-to-BEV transformation uses simple linear layers and BEV look-up tables, which keeps inference fast. Trained on an extensive proprietary dataset, NVAutoNet consistently achieves high perception accuracy while running at 53 frames per second on the NVIDIA DRIVE Orin SoC. NVAutoNet is also resilient to sensor mounting deviations arising from diverse car models, and it adapts to varied vehicle types through inexpensive model fine-tuning.
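
To illustrate why a BEV look-up table can make the image-to-BEV step cheap, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes, purely for illustration, that features are first arranged on a polar grid (angle bins by range bins) around the ego vehicle and are then resampled onto a Cartesian BEV grid. The function names `build_polar_lut` and `polar_to_bev`, the grid sizes, and the nearest-bin sampling are all hypothetical choices. The point is only that the geometry is computed once offline, so the per-frame work reduces to a single gather.

```python
import numpy as np

def build_polar_lut(bev_size, cell_m, n_angle, n_range, max_range_m):
    """Precompute, for every BEV cell, the polar (angle, range) bin it falls in.

    Hypothetical helper: grid layout and binning are illustrative assumptions.
    """
    h, w = bev_size
    # Metric coordinates of BEV cell centers, with the ego vehicle at the grid center.
    ys = (np.arange(h) - (h - 1) / 2.0) * cell_m
    xs = (np.arange(w) - (w - 1) / 2.0) * cell_m
    yy, xx = np.meshgrid(ys, xs, indexing="ij")

    rng = np.hypot(xx, yy)        # distance of each cell from the ego vehicle
    ang = np.arctan2(yy, xx)      # bearing of each cell, in [-pi, pi]

    # Nearest-bin quantization of angle and range.
    a_idx = np.floor((ang + np.pi) / (2.0 * np.pi) * n_angle).astype(np.int64)
    a_idx = np.clip(a_idx, 0, n_angle - 1)
    r_idx = np.floor(rng / max_range_m * n_range).astype(np.int64)
    valid = r_idx < n_range       # cells beyond max_range_m have no source feature
    r_idx = np.clip(r_idx, 0, n_range - 1)
    return a_idx, r_idx, valid

def polar_to_bev(polar_feat, lut):
    """Gather polar features of shape (C, n_angle, n_range) into a (C, H, W) BEV grid."""
    a_idx, r_idx, valid = lut
    bev = polar_feat[:, a_idx, r_idx]   # one indexed gather per frame
    return bev * valid                  # zero out cells outside the covered range

# Usage: build the table once, then reuse it for every frame.
lut = build_polar_lut(bev_size=(4, 4), cell_m=1.0, n_angle=8, n_range=4, max_range_m=2.0)
polar = np.arange(3 * 8 * 4, dtype=np.float32).reshape(3, 8, 4)
bev = polar_to_bev(polar, lut)
print(bev.shape)
```

Because the table depends only on the grid geometry, it could be regenerated for a different sensor layout without touching the learned weights; whether NVAutoNet does this is not stated in the abstract.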

Related Material

@InProceedings{Pham_2024_WACV,
    author    = {Pham, Trung and Maghoumi, Mehran and Jiang, Wanli and Jujjavarapu, Bala Siva Sashank and Sajjadi, Mehdi and Liu, Xin and Lin, Hsuan-Chu and Chen, Bor-Jeng and Truong, Giang and Fang, Chao and Kwon, Junghyun and Park, Minwoo},
    title     = {NVAutoNet: Fast and Accurate 360deg 3D Visual Perception for Self Driving},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {7376-7385}
}