Leveraging Vision Language Models for Specialized Agricultural Tasks

Arshad, Muhammad Arbab; Jubery, Talukder Zaki; Roy, Tirtho; Nassiri, Rim; Singh, Asheesh K.; Singh, Arti; Hegde, Chinmay; Ganapathysubramanian, Baskar; Balu, Aditya; Krishnamurthy, Adarsh; Sarkar, Soumik

Muhammad Arbab Arshad, Talukder Zaki Jubery, Tirtho Roy, Rim Nassiri, Asheesh K. Singh, Arti Singh, Chinmay Hegde, Baskar Ganapathysubramanian, Aditya Balu, Adarsh Krishnamurthy, Soumik Sarkar; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 6320-6329

Abstract

As Vision Language Models (VLMs) become increasingly accessible to farmers and agricultural experts, there is a growing need to evaluate their potential in specialized tasks. We present AgEval, a comprehensive benchmark for assessing VLMs' capabilities in plant stress phenotyping, offering a solution to the challenge of limited annotated data in agriculture. Our study explores how general-purpose VLMs can be leveraged for domain-specific tasks with only a few annotated examples, providing insights into their behavior and adaptability. AgEval encompasses 12 diverse plant stress phenotyping tasks, evaluating zero-shot and few-shot in-context learning performance of state-of-the-art models including Claude, GPT, Gemini, and LLaVA. Our results demonstrate VLMs' rapid adaptability to specialized tasks, with the best-performing model showing an increase in F1 scores from 46.24% to 73.37% in 8-shot identification. To quantify performance disparities across classes, we introduce metrics such as the coefficient of variation (CV), revealing that VLMs' training impacts classes differently, with CV ranging from 26.02% to 58.03%. We also find that strategic example selection enhances model reliability, with exact category examples improving F1 scores by 15.38% on average. AgEval establishes a framework for assessing VLMs in agricultural applications, offering valuable benchmarks for future evaluations. Our findings suggest that VLMs, with minimal few-shot examples, show promise as a viable alternative to traditional specialized models in plant stress phenotyping, while also highlighting areas for further refinement. Results and benchmark details are available at: https://github.com/arbab-ml/AgEval
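
The abstract reports a coefficient of variation (CV) across classes but does not give its formula. As a minimal sketch, assuming CV is the standard deviation of per-class F1 scores divided by their mean, expressed as a percentage, it could be computed as below. The function name, the choice of population standard deviation, and the example scores are all illustrative assumptions, not values or definitions taken from the paper.

```python
from statistics import mean, pstdev


def coefficient_of_variation(per_class_f1):
    """Return the CV (in percent) of a sequence of per-class F1 scores.

    Assumption: CV = 100 * population std / mean. The paper may use the
    sample standard deviation instead; swap pstdev for stdev in that case.
    """
    mu = mean(per_class_f1)
    if mu == 0:
        raise ValueError("CV is undefined when the mean F1 is zero")
    return 100.0 * pstdev(per_class_f1) / mu


# Hypothetical per-class F1 scores (percent), NOT results from the paper.
example_f1 = [80.0, 60.0, 40.0, 20.0]
cv = coefficient_of_variation(example_f1)
print(f"CV across classes: {cv:.2f}%")
```

Under this definition a higher CV means performance is spread less evenly across classes, and a model that scores identically on every class has a CV of 0%.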

Related Material

[bibtex]

@InProceedings{Arshad_2025_WACV,
    author    = {Arshad, Muhammad Arbab and Jubery, Talukder Zaki and Roy, Tirtho and Nassiri, Rim and Singh, Asheesh K. and Singh, Arti and Hegde, Chinmay and Ganapathysubramanian, Baskar and Balu, Aditya and Krishnamurthy, Adarsh and Sarkar, Soumik},
    title     = {Leveraging Vision Language Models for Specialized Agricultural Tasks},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {6320-6329}
}