Evaluating Transferability in Retrieval Tasks: An Approach Using MMD and Kernel Methods

Mengyu Dai, Amir Hossein Raffiee, Aashish Jain, Joshua Correa; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 22390-22400

Abstract


Retrieval tasks play central roles in real-world machine learning systems such as search engines, recommender systems, and retrieval-augmented generation (RAG). Achieving decent performance in these tasks often requires fine-tuning various pre-trained models on specific datasets and selecting the best candidate, a process that can be both time- and resource-consuming. To tackle the problem, we introduce a novel and efficient method called RetMMD that leverages Maximum Mean Discrepancy (MMD) and kernel methods to assess the transferability of pre-trained models in retrieval tasks. RetMMD is calculated on a pre-trained model and a target dataset without any fine-tuning involved. Specifically, given a query, we quantify the distribution discrepancy between relevant and irrelevant document embeddings by estimating the similarities within their mappings in the fine-tuned embedding space through a kernel method. This discrepancy is averaged over multiple queries, taking into account the distribution characteristics of the target dataset. Experiments suggest that the proposed metric calculated on pre-trained models closely aligns with retrieval performance post-fine-tuning. The observation holds across a variety of datasets, including image, text, and multi-modal domains, indicating the potential of using MMD and kernel methods for transfer learning evaluation in retrieval scenarios. In addition, we also design a way of evaluating dataset transferability for retrieval tasks, with experimental results demonstrating the effectiveness of the proposed approach.
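The per-query discrepancy described above can be illustrated with the standard kernel estimate of squared MMD. The sketch below is not the RetMMD formulation itself: the abstract does not specify the kernel, the estimator, or how mappings in the fine-tuned embedding space are approximated. It assumes an RBF kernel with a hypothetical bandwidth parameter `gamma`, the biased (V-statistic) estimator, and embeddings taken directly from a pre-trained model; all function names are illustrative.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    # Assumption: RBF kernel; the abstract does not name the kernel used.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_kernel(xs, ys, gamma=1.0):
    # Average kernel value over all pairs (x, y) with x in xs and y in ys.
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(relevant, irrelevant, gamma=1.0):
    # Biased (V-statistic) estimate of squared MMD between two embedding sets.
    return (mean_kernel(relevant, relevant, gamma)
            + mean_kernel(irrelevant, irrelevant, gamma)
            - 2.0 * mean_kernel(relevant, irrelevant, gamma))

def average_mmd_over_queries(per_query_sets, gamma=1.0):
    # per_query_sets: one (relevant, irrelevant) pair of embedding lists per query.
    scores = [mmd_squared(rel, irr, gamma) for rel, irr in per_query_sets]
    return sum(scores) / len(scores)

# Toy 2-D embeddings (made up for illustration only).
separated = ([[0.0, 0.0], [0.1, 0.0]], [[5.0, 5.0], [5.1, 5.0]])
overlapping = ([[0.0, 0.0], [0.1, 0.0]], [[0.0, 0.1], [0.1, 0.1]])
score = average_mmd_over_queries([separated, overlapping])
```

Under these assumptions, a query whose relevant and irrelevant documents are well separated contributes a larger value than one whose two sets overlap, so a higher average suggests embeddings that already distinguish relevant from irrelevant documents.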

Related Material


[bibtex]
@InProceedings{Dai_2024_CVPR,
    author    = {Dai, Mengyu and Raffiee, Amir Hossein and Jain, Aashish and Correa, Joshua},
    title     = {Evaluating Transferability in Retrieval Tasks: An Approach Using MMD and Kernel Methods},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {22390-22400}
}