MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

Ryan D. Burgert, Brian L. Price, Jason Kuen, Yijun Li, Michael S. Ryoo; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 22595-22604

Abstract


We introduce MAGICK, a large-scale dataset of generated objects with high-quality alpha mattes. While image generation methods have produced segmentations, they cannot generate alpha mattes with accurate details in hair, fur, and transparencies. This is likely due to the small size of current alpha matting datasets and the difficulty in obtaining ground-truth alpha. We propose a scalable method for synthesizing images of objects with high-quality alpha that can be used as a ground-truth dataset. A key idea is to generate objects on a single-colored background so chroma keying approaches can be used to extract the alpha. However, this faces several challenges, including that current text-to-image generation methods cannot create images that can be easily chroma keyed and that chroma keying is an underconstrained problem that generally requires manual intervention for high-quality results. We address this using a combination of generation and alpha extraction methods. Using our method, we generate a dataset of 150,000 objects with alpha. We show the utility of our dataset by training an alpha-to-rgb generation method that outperforms baselines. Please see our project website at https://ryanndagreat.github.io/MAGICK/.
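
To make the chroma keying idea concrete, here is a minimal sketch of a naive color-distance keyer. It is not the paper's alpha extraction method: the function names `naive_chroma_key` and `composite` and the thresholds `lo` and `hi` are hypothetical, chosen only for illustration. It relies on the standard compositing model I = alpha * F + (1 - alpha) * B. Even with a known background B, each pixel gives three equations for four unknowns (alpha and the three channels of F), which is why the problem is underconstrained and why a simple keyer like this one tends to fail on hair, fur, and transparent regions.

```python
import numpy as np


def naive_chroma_key(image, key_color, lo=0.1, hi=0.4):
    """Estimate alpha from color distance to a known background color.

    image:     float array of shape (H, W, 3) with values in [0, 1]
    key_color: three floats in [0, 1], the single background color
    lo, hi:    illustrative thresholds (assumptions, not from the paper);
               distances <= lo map to alpha 0, distances >= hi map to alpha 1
    """
    image = np.asarray(image, dtype=np.float64)
    key = np.asarray(key_color, dtype=np.float64)
    # Euclidean RGB distance to the key color, scaled into [0, 1].
    dist = np.linalg.norm(image - key, axis=-1) / np.sqrt(3.0)
    # Linear ramp between the two thresholds, clipped to a valid alpha range.
    return np.clip((dist - lo) / (hi - lo), 0.0, 1.0)


def composite(foreground, alpha, background):
    """Standard compositing: I = alpha * F + (1 - alpha) * B."""
    a = np.asarray(alpha, dtype=np.float64)[..., None]
    return a * np.asarray(foreground) + (1.0 - a) * np.asarray(background)


if __name__ == "__main__":
    green = (0.0, 1.0, 0.0)
    # A 1x2 image: one pure-background pixel and one opaque red pixel.
    img = np.array([[[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]])
    alpha = naive_chroma_key(img, green)
    print(alpha)
```

A keyer of this kind only measures how far a pixel is from the key color, so it cannot tell a semi-transparent pixel from an opaque one that happens to have a similar color. That ambiguity is the kind of case where, as the abstract notes, chroma keying generally requires manual intervention.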

Related Material


[bibtex]
@InProceedings{Burgert_2024_CVPR,
    author    = {Burgert, Ryan D. and Price, Brian L. and Kuen, Jason and Li, Yijun and Ryoo, Michael S.},
    title     = {MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {22595-22604}
}