ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders

Surojit Saha, Sarang Joshi, Ross Whitaker; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 889-898

Abstract


The variational autoencoder (VAE) is a popular deep latent-variable model (DLVM) due to its simple yet effective formulation for modeling the data distribution. Moreover, optimizing the VAE objective function is more manageable than for other DLVMs. The bottleneck dimension of the VAE is a crucial design choice, and it has strong ramifications for the model's performance, such as finding the hidden explanatory factors of a dataset using the representations learned by the VAE. However, the size of the latent dimension of the VAE is often treated as a hyperparameter, estimated empirically through trial and error. To this end, we propose a statistical formulation to discover the relevant latent factors required for modeling a dataset. In this work, we use a hierarchical prior in the latent space that estimates the variance of the latent axes using the encoded data, which identifies the relevant latent dimensions. For this, we replace the fixed prior in the VAE objective function with a hierarchical prior, keeping the remainder of the formulation unchanged. We call the proposed method the automatic relevancy detection in the variational autoencoder (ARD-VAE). We demonstrate the efficacy of the ARD-VAE on multiple benchmark datasets in finding the relevant latent dimensions and their effect on different evaluation metrics, such as the FID score and disentanglement analysis.
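The idea of a hierarchical prior whose per-axis variance is estimated from the encoded data can be sketched as follows. This is a minimal, illustrative NumPy sketch, not the paper's exact formulation: the plug-in estimator for the prior variance `tau2` (the batch mean of the posterior second moment per dimension) and the relevance threshold are assumptions made for the example.

```python
import numpy as np

def ard_kl_term(mu, logvar):
    """KL between the encoder posterior q(z|x) = N(mu, diag(exp(logvar)))
    and a per-dimension zero-mean prior N(0, tau2), where tau2 is a
    plug-in estimate from the encoded batch. The estimator used here
    (batch mean of mu^2 + var per dimension) is illustrative only."""
    var = np.exp(logvar)
    tau2 = (mu ** 2 + var).mean(axis=0)        # estimated prior variance, shape (D,)
    # Closed-form Gaussian KL, per sample and dimension, summed over dimensions.
    kl = 0.5 * ((mu ** 2 + var) / tau2 - 1.0 + np.log(tau2) - logvar)
    return kl.sum(axis=1).mean(), tau2

def relevant_dims(tau2, threshold=1e-2):
    """Axes whose estimated prior variance stays above a small threshold
    are treated as relevant; the rest have collapsed toward the prior."""
    return np.where(tau2 > threshold)[0]

# Toy batch of encoder outputs: 4 informative axes, 4 near-collapsed ones.
rng = np.random.default_rng(0)
mu = rng.normal(size=(128, 8))
mu[:, 4:] *= 1e-3                              # collapsed means near zero
logvar = np.concatenate([np.zeros((128, 4)),   # informative: unit variance
                         np.full((128, 4), -10.0)], axis=1)
kl, tau2 = ard_kl_term(mu, logvar)
print(relevant_dims(tau2))                     # → [0 1 2 3]
```

In a training loop, this KL term would replace the standard-normal KL in the VAE objective, so that uninformative axes shrink their estimated prior variance and can be pruned after training.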

Related Material


[pdf] [supp]
[bibtex]
@InProceedings{Saha_2025_WACV,
    author    = {Saha, Surojit and Joshi, Sarang and Whitaker, Ross},
    title     = {ARD-VAE: A Statistical Formulation to Find the Relevant Latent Dimensions of Variational Autoencoders},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {889-898}
}