Dynamic Texts From UAV Perspective Natural Images

Hidetomo Sakaino; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2023, pp. 2070-2081

Abstract


Drone-based image processing offers valuable capabilities for surveillance, detection, and tracking over vast areas, aiding disaster search and rescue and the monitoring of human-made events such as traffic jams and outdoor activities under adverse weather conditions. Nonetheless, the technology faces numerous challenges, including variations in scale and perspective and environmental factors such as sky interference and distant, small objects. In addition, ensuring a sufficiently long visibility distance in 3D depth is crucial for safe flight in settings such as airports, cities, and fields. However, local weather can change rapidly during a flight, and fog and clouds can reduce visibility. Because visibility measurement sensors are costly, lower-cost methods using portable equipment are desired for routine flights. This paper therefore proposes a camera-based approach to estimating visibility and weather conditions that combines multiple complementary Deep Learning (DL) models and Vision Language Models (VLMs) under adverse conditions. Experimental results show that the enhanced 2D/3D captions with physical scales are superior to those produced by SOTA VLMs.

Related Material


@InProceedings{Sakaino_2023_ICCV,
    author    = {Sakaino, Hidetomo},
    title     = {Dynamic Texts From UAV Perspective Natural Images},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2023},
    pages     = {2070-2081}
}