Towards Accurate Visual and Natural Language-Based Vehicle Retrieval Systems

Pirazh Khorramshahi, Sai Saketh Rambhatla, Rama Chellappa; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 4183-4192

Abstract


In this work, we consider two tracks of the 2021 NVIDIA AI City Challenge: City-Scale Multi-Camera Vehicle Re-identification and Natural Language-Based Vehicle Retrieval. For the vehicle re-identification task, we employ the state-of-the-art Excited Vehicle Re-Identification deep representation learning model, coupled with best training practices and domain adaptation techniques, to obtain robust embeddings. We further refine the re-identification results through a series of post-processing steps that remove the camera and vehicle orientation bias inherent in the re-identification task. We also take advantage of multiple observations of a vehicle using track-level information to obtain fine-grained retrieval results. For the natural language-based vehicle retrieval task, we leverage the recently proposed Contrastive Language-Image Pre-training (CLIP) model and propose a simple yet effective text-based vehicle retrieval system. We compare our performance against the top submissions to the challenge; our systems are ranked 8th on the public leaderboard for both tracks.
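
As a rough illustration of how track-level information can feed a retrieval ranking, the sketch below averages per-frame embeddings into one vector per track and ranks gallery tracks by cosine similarity to a query embedding. The abstract does not specify how the track-level observations are combined, so the mean pooling, the cosine scoring, and the function names (`l2_normalize`, `track_embedding`, `rank_tracks`) are all assumptions made for illustration, not the authors' method.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def track_embedding(frame_embeddings):
    """Assumed pooling: mean of the per-frame embeddings, then L2-normalized."""
    count = len(frame_embeddings)
    dim = len(frame_embeddings[0])
    mean = [sum(emb[i] for emb in frame_embeddings) / count for i in range(dim)]
    return l2_normalize(mean)


def rank_tracks(query_embedding, tracks):
    """Return track ids sorted by cosine similarity to the query, best first.

    `tracks` maps a (hypothetical) track id to a list of per-frame embeddings.
    """
    query = l2_normalize(query_embedding)
    scores = {
        track_id: sum(q * t for q, t in zip(query, track_embedding(frames)))
        for track_id, frames in tracks.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-D embeddings; a real system would use high-dimensional model outputs.
gallery = {
    "track_a": [[1.0, 0.0], [0.9, 0.1]],
    "track_b": [[0.0, 1.0], [0.1, 0.9]],
}
ranking = rank_tracks([1.0, 0.0], gallery)
```

The same ranking function would apply to the text-based track if the query embedding came from a text encoder that shares an embedding space with the image encoder, which is the kind of joint space CLIP is trained to produce.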

Related Material


@InProceedings{Khorramshahi_2021_CVPR,
    author    = {Khorramshahi, Pirazh and Rambhatla, Sai Saketh and Chellappa, Rama},
    title     = {Towards Accurate Visual and Natural Language-Based Vehicle Retrieval Systems},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {4183-4192}
}