Prototype-based Dataset Comparison

Nanne van Noord; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 1944-1954


Dataset summarisation is a fruitful approach to dataset inspection. However, when applied to a single dataset, the discovery of visual concepts is restricted to those that are most prominent. We argue that a comparative approach can expand upon this paradigm to enable richer forms of dataset inspection that go beyond the most prominent concepts. To enable dataset comparison, we present a module that learns concept-level prototypes across datasets. We leverage self-supervised learning to discover these prototypes without supervision, and we demonstrate the benefits of our approach in two case studies. Our findings show that dataset comparison extends dataset inspection, and we hope to encourage more work in this direction. Code and usage instructions available at

Related Material

@InProceedings{van_Noord_2023_ICCV,
    author    = {van Noord, Nanne},
    title     = {Prototype-based Dataset Comparison},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {1944-1954}
}