Efficient Two-Stage Model Retraining for Machine Unlearning

Junyaup Kim, Simon S. Woo; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 4361-4369


With the rise of the General Data Protection Regulation (GDPR), user data holders should guarantee the individual's "right to be forgotten". This means user data holders must completely remove a user's data upon request. However, enabling a deep learning model to exclude specific data used during training is challenging, because it is not easy to define what "forgetting" means in deep learning or how to achieve it. To address this issue, we propose an efficient machine unlearning architecture for computer vision classification models. Our approach consists of two stages. In the first stage, we train the deep learning model with contrastive labels on the requested dataset so that it loses information about that data. In the second stage, we retrain the first-stage output model with knowledge distillation (KD). Using this two-stage approach, we can substantiate the removal, or forgetting, of the requested dataset in the deep learning model. With various datasets used for multimedia applications, we demonstrate that our approach achieves accuracy on par with, or even higher than, the original model, while effectively removing the requested data.
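The two stages can be illustrated with a minimal sketch. The abstract does not specify how the contrastive labels are constructed or which model acts as the KD teacher, so the code below makes two assumptions: a contrastive label is taken to be any class other than the sample's true class, and the KD objective is the common temperature-softened KL divergence between teacher and student outputs. The function names `contrastive_label` and `kd_loss` are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_label(true_label, num_classes, rng):
    """Stage 1 (assumption): replace the true label of a to-be-forgotten
    sample with a randomly chosen different class."""
    candidates = [c for c in range(num_classes) if c != true_label]
    return rng.choice(candidates)


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Stage 2 (assumption): generic KD loss, KL(teacher || student) on
    temperature-softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Stage 1: relabel the samples whose removal was requested.
rng = random.Random(0)
forget_labels = [3, 1, 4, 1, 5]
relabeled = [contrastive_label(y, num_classes=10, rng=rng) for y in forget_labels]

# Stage 2: the loss is zero when the student matches the teacher
# and positive when their outputs differ.
matching = kd_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
differing = kd_loss([0.0, 0.0, 0.0], [2.0, 0.5, -1.0])
```

In a real pipeline, the relabeled samples would be used to fine-tune the network in the first stage, and a loss like `kd_loss` would be minimized over the data to be retained in the second stage. Both are outside the scope of this sketch.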

Related Material

@InProceedings{Kim_2022_CVPR,
    author    = {Kim, Junyaup and Woo, Simon S.},
    title     = {Efficient Two-Stage Model Retraining for Machine Unlearning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2022},
    pages     = {4361-4369}
}