@InProceedings{Mehrab_2025_CVPR,
    author    = {Mehrab, Kazi Sajeed and Maruf, M. and Daw, Arka and Neog, Abhilash and Manogaran, Harish Babu and Khurana, Mridul and Feng, Zhenyang and Altintas, Bahadir and Bakis, Yasin and Campolongo, Elizabeth G and Thompson, Matthew J and Wang, Xiaojun and Lapp, Hilmar and Berger-Wolf, Tanya and Mabee, Paula and Bart, Henry and Chao, Wei-Lun and Dahdul, Wasila M and Karpatne, Anuj},
    title     = {Fish-Vista: A Multi-Purpose Dataset for Understanding \& Identification of Traits from Images},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {24275-24285}
}
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Abstract
We introduce Fish-Visual Trait Analysis (Fish-Vista), the first organismal image dataset designed for the analysis of visual traits of aquatic species directly from images using machine learning and computer vision methods. Fish-Vista contains 69,269 annotated images spanning 4,316 fish species, curated and organized to serve three downstream tasks: species classification, trait identification, and trait segmentation. Our work makes two key contributions. First, we provide a fully reproducible pipeline for processing fish images sourced from various museum collections, contributing to the advancement of AI in biodiversity science. We annotate the images with carefully curated labels from biological databases and manual annotations to create an AI-ready dataset of visual traits. Second, our work offers fertile ground for researchers to develop novel methods for a variety of problems in computer vision, such as handling long-tailed distributions, out-of-distribution generalization, learning with weak labels, explainable AI, and segmenting small objects. Dataset and code for Fish-Vista are available at https://github.com/Imageomics/Fish-Vista