Synthetic Data Generation Using Imitation Training

Aman Kishore, Tae Eun Choe, Junghyun Kwon, Minwoo Park, Pengfei Hao, Akshita Mittel; Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 3078-3086


We propose a strategic approach to generating synthetic data in order to improve machine learning algorithms such as deep neural networks (DNNs). The use of synthetic data has shown promising results, yet there are no specific rules or recipes for how to generate and prepare it. We propose imitation training as a guideline for synthetic data generation: it adds more underrepresented entities and balances the data distribution so that a DNN can handle corner cases and resolve long-tail problems. Imitation training is a circular process with three main steps. First, the existing system is evaluated and failure cases such as false positive and false negative detections are identified. Second, synthetic data imitating those failure cases is created with domain randomization. Third, a network is trained on the existing data together with the newly added synthetic data. These three steps are repeated until the evaluation metric converges. We validated the approach with experiments on object detection in autonomous driving.
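The three-step cycle described in the abstract can be sketched as a small loop. This is only an illustrative sketch, not the authors' implementation: the function `imitation_training`, its `evaluate`, `synthesize`, and `train` callables, the `tol` and `max_rounds` parameters, and the toy label-set "detector" are all hypothetical names chosen for this example. The toy model simply memorizes which class labels it has seen, so a "failure case" is an evaluation class missing from the training data, and "domain randomization" is reduced to drawing a random brightness value.

```python
import random


def imitation_training(model, real_data, evaluate, synthesize, train,
                       tol=1e-3, max_rounds=10):
    """Repeat evaluate -> synthesize -> train until the metric stops changing.

    All callables are hypothetical placeholders:
      evaluate(model)      -> (metric, failure_cases)
      synthesize(failures) -> list of synthetic samples imitating the failures
      train(model, data)   -> updated model
    """
    data = list(real_data)
    history = []
    for _ in range(max_rounds):
        # Step 1: evaluate the current system and collect its failure cases.
        metric, failures = evaluate(model)
        history.append(metric)
        # Stop once the evaluation metric has converged.
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
        # Step 2: create synthetic data that imitates the failure cases.
        data.extend(synthesize(failures))
        # Step 3: retrain on the existing data plus the new synthetic data.
        model = train(model, data)
    return model, history


# Toy stand-ins (assumptions for illustration only).
EVAL_CLASSES = ["car", "truck", "pedestrian", "cyclist"]
rng = random.Random(0)


def evaluate(model):
    # A class counts as a failure if the toy model has never seen it.
    failures = [c for c in EVAL_CLASSES if c not in model]
    return 1 - len(failures) / len(EVAL_CLASSES), failures


def synthesize(failures):
    # "Domain randomization" here is just a random brightness per sample.
    return [(label, rng.uniform(0.5, 1.5)) for label in failures for _ in range(3)]


def train(model, data):
    # The toy model is the set of labels present in the training data.
    return {label for label, _ in data}


real_data = [("car", 1.0), ("car", 0.8), ("truck", 1.0)]
final_model, history = imitation_training(
    {"car", "truck"}, real_data, evaluate, synthesize, train
)
print(history)
```

In this toy run the metric rises once the missing classes are synthesized and then stays flat, which triggers the convergence check. A real detector would replace the three callables with actual evaluation, rendering, and training code, and would likely need several rounds.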

Related Material

@InProceedings{Kishore_2021_ICCV,
    author    = {Kishore, Aman and Choe, Tae Eun and Kwon, Junghyun and Park, Minwoo and Hao, Pengfei and Mittel, Akshita},
    title     = {Synthetic Data Generation Using Imitation Training},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2021},
    pages     = {3078-3086}
}