Pre-training Vision Models with Mandelbulb Variations

Benjamin Naoto Chiche, Yuto Horikawa, Ryo Fujita; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 22062-22071

Abstract


The use of models that have been pre-trained on natural image datasets like ImageNet may face some limitations. First this use may be restricted due to copyright and license on the training images and privacy laws. Second these datasets and models may incorporate societal and ethical biases. Formula-driven supervised learning (FDSL) enables model pre-training to circumvent these issues. This consists of generating a synthetic image dataset based on mathematical formulae and pre-training the model on it. In this work we propose novel FDSL datasets based on Mandelbulb Variations. These datasets contain RGB images that are projections of colored objects deriving from the 3D Mandelbulb fractal. Pre-training ResNet-50 on one of our proposed datasets MandelbulbVAR-1k enables an average top-1 accuracy over target classification datasets that is at least 1% higher than pre-training on existing FDSL datasets. With regard to anomaly detection on MVTec AD pre-training the WideResNet-50 backbone on MandelbulbVAR-1k enables PatchCore to achieve 97.2% average image-level AUROC. This is only 1.9% lower than pre-training on ImageNet-1k (99.1%) and 4.5% higher than pre-training on the second-best performing FDSL dataset i.e. VisualAtom-1k (92.7%). Regarding Vision Transformer (ViT) pre-training another dataset that we propose and coin MandelbulbVAR-Hybrid-21k enables ViT-Base to achieve 82.2% top-1 accuracy on ImageNet-1k which is 0.4% higher than pre-training on ImageNet-21k (81.8%) and only 0.1% lower than pre-training on VisualAtom-1k (82.3%).

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Chiche_2024_CVPR, author = {Chiche, Benjamin Naoto and Horikawa, Yuto and Fujita, Ryo}, title = {Pre-training Vision Models with Mandelbulb Variations}, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, month = {June}, year = {2024}, pages = {22062-22071} }