SLoSH: Set Locality Sensitive Hashing via Sliced-Wasserstein Embeddings

Yuzhe Lu, Xinran Liu, Andrea Soltoggio, Soheil Kolouri; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 2566-2576

Abstract


Learning from set-structured data is an essential problem with many applications in machine learning and computer vision. This paper focuses on non-parametric and data-independent learning from set-structured data using approximate nearest neighbor (ANN) solutions, particularly locality-sensitive hashing. We consider the problem of set retrieval from an input set query. Such a retrieval problem requires: 1) an efficient mechanism to calculate the distances/dissimilarities between sets, and 2) an appropriate data structure for fast nearest-neighbor search. To that end, we propose to use Sliced-Wasserstein embedding as a computationally efficient set-2-vector operator that enables downstream ANN, with theoretical guarantees. The set elements are treated as samples from an unknown underlying distribution, and the Sliced-Wasserstein distance is used to compare sets. We demonstrate the effectiveness of our algorithm, denoted as Set Locality Sensitive Hashing (SLoSH), on various set retrieval datasets and compare our proposed embedding with standard set embedding approaches, including Generalized Mean (GeM) embedding/pooling, Featurewise Sort Pooling (FSPool), Covariance Pooling, and Wasserstein embedding and show consistent improvement in retrieval results.

Related Material


[pdf] [supp] [arXiv]
[bibtex]
@InProceedings{Lu_2024_WACV, author = {Lu, Yuzhe and Liu, Xinran and Soltoggio, Andrea and Kolouri, Soheil}, title = {SLoSH: Set Locality Sensitive Hashing via Sliced-Wasserstein Embeddings}, booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}, month = {January}, year = {2024}, pages = {2566-2576} }