Tracked-Vehicle Retrieval by Natural Language Descriptions With Domain Adaptive Knowledge

Huy Dinh-Anh Le, Quang Qui-Vinh Nguyen, Vuong Ai Nguyen, Thong Duy-Minh Nguyen, Nhat Minh Chung, Tin-Trung Thái, Synh Viet-Uyen Ha; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 3300-3309

Abstract


This paper introduces our solution for Track 2 of the AI City Challenge 2022. The Track 2 task is Tracked-Vehicle Retrieval by Natural Language Descriptions, evaluated on a real-world dataset that covers different scenarios and multiple cameras. We mainly focus on developing a robust natural language-based vehicle retrieval system that addresses the domain bias problem caused by unseen scenarios and multi-view, multi-camera vehicle tracks. Specifically, we apply CLIP to effectively extract both visual and textual representations for contrastive representation learning. Furthermore, since there are new scenarios in the test set, we propose a new Domain Adaptive Training method that utilizes the information from labeled data and transfers it to unlabeled data to generate pseudo labels. By using this simple and effective strategy, we not only bridge the domain gap between the training set and the test set but also require less computation and data than previous top-performing methods. Finally, we use a post-processing method called pruning to eliminate wrongly retrieved vehicle tracks. Taking one step further, we also investigate the impact of different text formats and of the amount of pseudo-labeled data used in the fine-tuning process. Our proposed method achieved 3rd place in the AI City Challenge 2022, yielding a competitive performance of 47.73% MRR accuracy on the private test set, which verifies the effectiveness and scalability of the proposed solution.
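As a small illustration of the reported metric, mean reciprocal rank (MRR) averages, over all text queries, the reciprocal of the rank at which the correct vehicle track appears in the retrieved list. The sketch below assumes exactly one ground-truth track per query; the function name `mean_reciprocal_rank` and the toy query and track IDs are hypothetical and are not taken from the paper or the challenge evaluation code.

```python
from typing import Dict, List


def mean_reciprocal_rank(
    rankings: Dict[str, List[str]], ground_truth: Dict[str, str]
) -> float:
    """Average of 1/rank of the correct track over all queries.

    Assumption: each query has a single correct track. A query whose
    correct track is missing from its ranked list contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked_tracks in rankings.items():
        target = ground_truth[query_id]
        if target in ranked_tracks:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranked_tracks.index(target) + 1)
    return total / len(rankings)


# Toy data (hypothetical IDs): the correct track is ranked 2nd for q1
# and 1st for q2, so MRR = (1/2 + 1/1) / 2.
rankings = {"q1": ["t3", "t1", "t2"], "q2": ["t2", "t3", "t1"]}
ground_truth = {"q1": "t1", "q2": "t2"}
mrr = mean_reciprocal_rank(rankings, ground_truth)
print(f"MRR = {mrr:.2f}")  # MRR = 0.75
```

Under this definition, a score of 47.73% means the correct track sits, on average, at roughly the second position of the returned ranking.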

Related Material


[bibtex]
@InProceedings{Le_2022_CVPR,
    author    = {Le, Huy Dinh-Anh and Nguyen, Quang Qui-Vinh and Nguyen, Vuong Ai and Nguyen, Thong Duy-Minh and Chung, Nhat Minh and Th\'ai, Tin-Trung and Ha, Synh Viet-Uyen},
    title     = {Tracked-Vehicle Retrieval by Natural Language Descriptions With Domain Adaptive Knowledge},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {3300-3309}
}