CountingDINO: A Training-free Pipeline for Class-Agnostic Counting using Unsupervised Backbones

Giacomo Pacini, Lorenzo Bianchi, Luca Ciampi, Nicola Messina, Giuseppe Amato, Fabrizio Falchi; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026, pp. 806-815

Abstract


Class-agnostic counting (CAC) aims to estimate the number of objects in images without being restricted to predefined categories. However, while current exemplar-based CAC methods offer flexibility at inference time, they still rely heavily on labeled data for training, which limits scalability and generalization to many downstream use cases. In this paper, we introduce CountingDINO, the first training-free exemplar-based CAC framework that exploits a fully unsupervised feature extractor. Specifically, our approach employs self-supervised vision-only backbones to extract object-aware features, eliminating the need for annotated data throughout the entire pipeline. At inference time, we extract latent object prototypes via ROI-Align from DINO features and use them as convolutional kernels to generate similarity maps. These are then transformed into density maps through a simple yet effective normalization scheme. We evaluate our approach on the FSC-147 and CARPK benchmarks, where we consistently outperform a baseline based on a state-of-the-art (SOTA) unsupervised object detector under the same label- and training-free setting. Additionally, our results are competitive with, and in some cases surpass, those of training-free methods that rely on supervised backbones, non-training-free unsupervised methods, and several fully supervised SOTA approaches. This demonstrates that label- and training-free CAC can be both scalable and effective. Website: https://lorebianchi98.github.io/CountingDINO/.
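
The inference steps described in the abstract (prototype extraction, prototype-as-kernel similarity, normalization to a density map) can be illustrated with a small toy sketch in pure Python. This is not the authors' implementation. It makes several assumptions that do not come from the abstract: the feature map is a plain nested list rather than DINO output, a direct crop stands in for ROI-Align, negative similarities are clipped to zero, and the normalization divides the similarity map by its total mass inside the exemplar box, so that the exemplar integrates to one object. All function names are hypothetical.

```python
# Toy sketch of an exemplar-based counting pipeline. Everything here is an
# illustrative assumption, not the CountingDINO code: features are an
# H x W x C nested list, and a plain crop replaces ROI-Align.

def crop_prototype(feats, box):
    """Stand-in for ROI-Align: copy the feature patch in box = (top, left, h, w)."""
    top, left, h, w = box
    return [[feats[top + i][left + j] for j in range(w)] for i in range(h)]


def similarity_map(feats, proto):
    """Cross-correlate the prototype with the features (zero padding, same size)."""
    H, W = len(feats), len(feats[0])
    kh, kw = len(proto), len(proto[0])
    C = len(proto[0][0])
    oy, ox = kh // 2, kw // 2  # kernel anchor offsets
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - oy, x + j - ox
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += sum(feats[yy][xx][c] * proto[i][j][c] for c in range(C))
            out[y][x] = max(acc, 0.0)  # assumption: clip negative responses
    return out


def to_density(sim, box):
    """Assumed normalization: scale so the exemplar box integrates to one object."""
    top, left, h, w = box
    mass = sum(sim[top + i][left + j] for i in range(h) for j in range(w))
    if mass <= 0:
        raise ValueError("exemplar box has no positive similarity mass")
    return [[v / mass for v in row] for row in sim]


def estimate_count(feats, box):
    """Run the three steps and sum the density map to obtain a count."""
    proto = crop_prototype(feats, box)
    density = to_density(similarity_map(feats, proto), box)
    return sum(sum(row) for row in density)


# Toy 6 x 6 single-channel feature map with three identical "objects".
feats = [[[0.0] for _ in range(6)] for _ in range(6)]
for (r, c) in [(1, 1), (1, 4), (4, 1)]:
    feats[r][c] = [1.0]

estimated = estimate_count(feats, box=(1, 1, 1, 1))  # exemplar covers one object
print(estimated)
```

In this toy setting the exemplar box covers exactly one object, so dividing by the similarity mass inside the box makes each matching object contribute roughly one unit to the summed density. With real backbone features the responses are noisier, and the actual normalization used in the paper may differ from this assumption.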

Related Material


@InProceedings{Pacini_2026_WACV,
    author    = {Pacini, Giacomo and Bianchi, Lorenzo and Ciampi, Luca and Messina, Nicola and Amato, Giuseppe and Falchi, Fabrizio},
    title     = {CountingDINO: A Training-free Pipeline for Class-Agnostic Counting using Unsupervised Backbones},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {March},
    year      = {2026},
    pages     = {806-815}
}