@InProceedings{Singh_2024_CVPR,
    author    = {Singh, Simranjit and Fore, Michael and Stamoulis, Dimitrios},
    title     = {GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {585-594}
}
GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots
Abstract
Geospatial Copilots unlock unprecedented potential for performing Earth Observation (EO) applications through natural language instructions. However, existing agents rely on overly simplified single tasks and template-based prompts, creating a disconnect with real-world scenarios. In this work, we present GeoLLM-Engine, an environment for tool-augmented agents with intricate tasks routinely executed by analysts on remote sensing platforms. We enrich our environment with geospatial API tools, dynamic maps/UIs, and external multimodal knowledge bases to properly gauge an agent's proficiency in interpreting realistic high-level natural language commands and its functional correctness in task completions. By alleviating overheads typically associated with human-in-the-loop benchmark curation, we harness our massively parallel engine across 100 GPT-4-Turbo nodes, scaling to over half a million diverse multi-tool tasks and across 1.1 million satellite images. By moving beyond traditional single-task image-caption paradigms, we investigate state-of-the-art agents and prompting techniques against long-horizon prompts.