GEMS: Generating Efficient Meta-Subnets

Varad Pimpalkhute, Shruti Kunde, Rekha Singhal; Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2023, pp. 5315-5323


Gradient-based meta-learners (GBML) such as MAML aim to learn a model initialization across similar tasks, such that the model generalizes well, with only a few gradient updates, on unseen tasks sampled from the same distribution. A limitation of GBML is its inability to adapt to real-world applications where input tasks are sampled from multiple distributions. An existing effort learns N initializations for tasks sampled from N distributions, roughly increasing training time by a factor of N. Instead, we use a single model initialization to learn distribution-specific parameters for every input task. This reduces negative knowledge transfer across distributions and the overall computational cost. Specifically, we explore two ways of learning efficiently on multi-distribution tasks: (1) the Binary Mask Perceptron (BMP), which learns distribution-specific layers, and (2) the Multi-Modal Supermask (MMSUP), which learns distribution-specific parameters. We evaluate the performance of the proposed framework (GEMS) on few-shot vision classification tasks. The experimental results demonstrate a significant improvement in accuracy and a reduction in training time over existing state-of-the-art algorithms on quasi-benchmark tasks.
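To make the supermask idea concrete, below is a minimal NumPy sketch (a hypothetical illustration, not the authors' implementation) of how a binary mask can carve a distribution-specific subnetwork out of a single shared model initialization. The names `shared_weights`, `masks`, and `subnet` are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

# A single shared (meta-learned) initialization; one weight matrix
# stands in for the full model here.
shared_weights = rng.normal(size=(4, 4))

# One binary mask per task distribution. Each mask selects which of
# the shared parameters are active for tasks from that distribution.
masks = {
    "dist_A": (rng.random((4, 4)) > 0.5).astype(float),
    "dist_B": (rng.random((4, 4)) > 0.5).astype(float),
}

def subnet(dist: str) -> np.ndarray:
    """Return the distribution-specific subnetwork: shared weights
    with non-selected parameters zeroed out by the binary mask."""
    return shared_weights * masks[dist]

# Different distributions activate different subsets of the same
# shared parameters, so no separate per-distribution initialization
# is needed.
print(subnet("dist_A").shape, subnet("dist_B").shape)
```

In this framing, meta-training would update the shared weights (and the masks), while each mask keeps the per-distribution parameter sets largely disjoint, limiting negative transfer across distributions.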

@InProceedings{Pimpalkhute_2023_WACV,
  author    = {Pimpalkhute, Varad and Kunde, Shruti and Singhal, Rekha},
  title     = {GEMS: Generating Efficient Meta-Subnets},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2023},
  pages     = {5315-5323}
}