@InProceedings{Qazi_2024_CVPR,
    author    = {Qazi, Ahmed and Razzaq, Taha and Iqbal, Asim},
    title     = {AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {7973-7982}
}
AnimalFormer: Multimodal Vision Framework for Behavior-based Precision Livestock Farming
Abstract
We introduce a multimodal vision framework for precision livestock farming, harnessing the power of GroundingDINO, HQSAM, and ViTPose models. This integrated suite enables comprehensive behavioral analytics from video data without invasive animal tagging. GroundingDINO generates accurate bounding boxes around livestock, while HQSAM segments individual animals within these boxes. ViTPose estimates key body points, facilitating posture and movement analysis. Demonstrated on a sheep dataset with grazing, running, sitting, standing, and walking activities, our framework extracts invaluable insights: activity and grazing patterns, interaction dynamics, and detailed postural evaluations. Applicable across species and video resolutions, this framework revolutionizes non-invasive livestock monitoring for activity detection, counting, health assessments, and posture analyses. It empowers data-driven farm management, optimizing animal welfare and productivity through AI-powered behavioral understanding.
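The three stages named in the abstract (detection, segmentation, pose estimation) can be pictured as a per-frame pipeline. The following is a minimal sketch only, not the authors' implementation: `analyze_frame`, `AnimalObservation`, and the `fake_*` stand-ins are hypothetical names introduced for illustration, no real GroundingDINO, HQSAM, or ViTPose API is called, and passing the bounding box to the pose stage is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels
Mask = List[List[bool]]                   # one boolean per pixel of the frame
Keypoint = Tuple[float, float]            # (x, y) in pixels
Frame = List[List[int]]                   # toy grayscale image for this sketch


@dataclass
class AnimalObservation:
    """Everything extracted for one animal in one frame."""
    box: Box
    mask: Mask
    keypoints: List[Keypoint]


def analyze_frame(
    frame: Frame,
    detect: Callable[[Frame], List[Box]],
    segment: Callable[[Frame, Box], Mask],
    estimate_pose: Callable[[Frame, Box], List[Keypoint]],
) -> List[AnimalObservation]:
    """Run detection, then box-prompted segmentation and pose per animal."""
    observations = []
    for box in detect(frame):                  # stage 1: bounding boxes
        mask = segment(frame, box)             # stage 2: mask inside the box
        keypoints = estimate_pose(frame, box)  # stage 3: body keypoints (assumed box input)
        observations.append(AnimalObservation(box, mask, keypoints))
    return observations


# Hypothetical stand-ins so the sketch runs without any model weights.
def fake_detect(frame: Frame) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


def fake_segment(frame: Frame, box: Box) -> Mask:
    x0, y0, x1, y1 = (int(v) for v in box)
    # Mark every pixel inside the box as belonging to the animal.
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]


def fake_pose(frame: Frame, box: Box) -> List[Keypoint]:
    x0, y0, x1, y1 = box
    # A single "keypoint" at the box centre, purely for illustration.
    return [((x0 + x1) / 2, (y0 + y1) / 2)]


frame = [[0] * 4 for _ in range(2)]            # 2 rows x 4 columns
observations = analyze_frame(frame, fake_detect, fake_segment, fake_pose)
animal_count = len(observations)               # counting falls out of detection
```

Keeping the three models behind plain callables means each stage could be swapped for a real model wrapper without changing the loop, and per-frame outputs such as `animal_count` or keypoint trajectories could then feed the activity and posture analyses the abstract describes.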