GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots

Simranjit Singh, Michael Fore, Dimitrios Stamoulis; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 585-594

Abstract


Geospatial Copilots unlock unprecedented potential for performing Earth Observation (EO) applications through natural language instructions. However, existing agents rely on overly simplified single tasks and template-based prompts, creating a disconnect with real-world scenarios. In this work, we present GeoLLM-Engine, an environment for tool-augmented agents with intricate tasks routinely executed by analysts on remote sensing platforms. We enrich our environment with geospatial API tools, dynamic maps/UIs, and external multimodal knowledge bases to properly gauge an agent's proficiency in interpreting realistic high-level natural language commands and its functional correctness in task completions. By alleviating overheads typically associated with human-in-the-loop benchmark curation, we harness our massively parallel engine across 100 GPT-4-Turbo nodes, scaling to over half a million diverse multi-tool tasks and across 1.1 million satellite images. By moving beyond traditional single-task image-caption paradigms, we investigate state-of-the-art agents and prompting techniques against long-horizon prompts.

Related Material


[bibtex]
@InProceedings{Singh_2024_CVPR,
    author    = {Singh, Simranjit and Fore, Michael and Stamoulis, Dimitrios},
    title     = {GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {585-594}
}