Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data

Pal, Arghya; Balasubramanian, Vineeth N.

Arghya Pal, Vineeth N. Balasubramanian; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 1556-1565

Abstract

The scarcity of large, curated, hand-labeled training datasets is a major bottleneck in deploying machine learning models in computer vision and other fields. Recent work on Data Programming has shown how distant supervision signals, in the form of labeling functions, can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data together with a curated, aggregated label, given a set of weak labeling functions. We validated our method on the MNIST, Fashion MNIST, CIFAR 10 and SVHN datasets, where it outperformed many state-of-the-art models. We conducted extensive experiments to study its usefulness and showed how the proposed ADP framework can be used for transfer learning as well as multitask learning, where data from two domains are generated simultaneously, along with the label information. Our future work will involve understanding the theoretical implications of this new framework from a game-theoretic perspective and exploring the performance of the method on more complex datasets.
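To make the notion of weak labeling functions concrete, the following is a minimal sketch of how several noisy functions can each vote on a label and how those votes can be combined. It is an illustration under stated assumptions, not the ADP method: the three labeling functions, their thresholds, and the toy "image" (a flat list of pixel intensities in [0, 1]) are hypothetical, and the aggregation shown is plain majority voting rather than the adversarially learned aggregation the abstract describes.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence

Label = Optional[int]          # None means the labeling function abstains
LabelingFunction = Callable[[Sequence[float]], Label]


def lf_bright(pixels: Sequence[float]) -> Label:
    """Hypothetical rule: a bright image is voted class 1, otherwise abstain."""
    return 1 if sum(pixels) / len(pixels) > 0.6 else None


def lf_dark(pixels: Sequence[float]) -> Label:
    """Hypothetical rule: a dark image is voted class 0, otherwise abstain."""
    return 0 if sum(pixels) / len(pixels) < 0.4 else None


def lf_peak(pixels: Sequence[float]) -> Label:
    """Hypothetical rule: a near-saturated pixel votes class 1, else class 0."""
    return 1 if max(pixels) > 0.9 else 0


def aggregate(votes: Sequence[Label]) -> Label:
    """Majority vote over non-abstaining votes; abstain on ties or no votes."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        return None
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # the top two labels are tied
    return ranked[0][0]


def weak_label(pixels: Sequence[float], lfs: List[LabelingFunction]) -> Label:
    """Apply every labeling function to one sample and aggregate the votes."""
    return aggregate([lf(pixels) for lf in lfs])


LFS: List[LabelingFunction] = [lf_bright, lf_dark, lf_peak]

if __name__ == "__main__":
    print(weak_label([0.9, 0.95, 0.8], LFS))
    print(weak_label([0.1, 0.2, 0.1], LFS))
```

Majority voting weights every labeling function equally. The abstract's point is that the aggregated label can instead be produced as part of an adversarial generation process, which this sketch does not attempt to reproduce.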

Related Material

[bibtex]

@InProceedings{Pal_2018_CVPR,
author = {Pal, Arghya and Balasubramanian, Vineeth N.},
title = {Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018},
pages = {1556--1565}
}