F?D: On Understanding the Role of Deep Feature Spaces on Face Generation Evaluation

Kabra, Krish; Balakrishnan, Guha

Krish Kabra, Guha Balakrishnan; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 8327-8332

Abstract

Perceptual metrics like the Fréchet Inception Distance (FID) are widely used to assess the similarity between synthetically generated and ground truth (real) images. The key idea behind these metrics is to compute errors in a deep feature space that captures perceptually and semantically rich image features. Despite their popularity, the effect that different deep features and their design choices have on a perceptual metric has not been well studied. In this work, we perform a causal analysis linking differences in semantic attributes and distortions between face image distributions to Fréchet distances (FD) using several popular deep feature spaces. A key component of our analysis is the creation of synthetic counterfactual faces using deep face generators. Our experiments show that the FD is heavily influenced by its feature space's training dataset and objective function. For example, FD using features extracted from ImageNet-trained models heavily emphasizes hats over regions like the eyes and mouth. Moreover, FD using features from a face gender classifier emphasizes hair length more than distances in an identity (recognition) feature space.
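For readers unfamiliar with the metric, the quantity usually reported as a Fréchet distance is the squared 2-Wasserstein distance between Gaussians fitted to the two feature sets: ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)). The following is a minimal sketch of that formula, not the authors' code. The function name `frechet_distance` and its arguments are hypothetical, and the features are assumed to have already been extracted by whichever deep network defines the feature space.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features), assumed to be
    features already extracted by some deep network (not shown here).
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each feature set.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((C_a C_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_a @ C_b, which are real and non-negative in exact
    # arithmetic; clip small negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

Only the feature extractor changes between feature spaces; the distance computation stays the same. That is why the choice of network, training dataset, and objective can shift what the metric emphasizes.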

Related Material

@InProceedings{Kabra_2024_CVPR,
    author    = {Kabra, Krish and Balakrishnan, Guha},
    title     = {F?D: On Understanding the Role of Deep Feature Spaces on Face Generation Evaluation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {8327-8332}
}