Revisiting Deep Archetypal Analysis for Phenotype Discovery in High Content Imaging

Mario Wieser, Daniel Siegismund, Stephan Steigele; Proceedings of the Winter Conference on Applications of Computer Vision (WACV), 2025, pp. 3802-3811

Abstract


The discovery of unique treatment candidates for complex diseases is a challenging task for current drug discovery programs. Biopharma research has developed automated and scalable screening assays of cell culture models to screen thousands of drug candidates in parallel, e.g., with bio-image-based assays. However, the large amount of data prevents human experts from systematically reviewing it to distinguish between disease and healthy phenotypes. A prevalent approach to uncovering phenotypic endpoints in a dataset is archetypal analysis, which seeks extremal points in the dataset. State-of-the-art non-linear archetypal methods based on variational autoencoders require k - 1 latent dimensions to encode k archetypes. However, in high content imaging, encoding high content images (HCIs) frequently requires significantly more latent dimensions than archetypes, which results in weak latent representations and ambiguous archetypes. To overcome this limitation, we propose to relax the simplex constraint in the latent space to a unit hypersphere and to learn the respective archetypes with online dictionary learning. Extensive experiments on two industry-relevant assays and a synthetic MNIST example demonstrate that our method outperforms state-of-the-art deep archetypal analysis approaches.
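
The contrast between the two latent constraints named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `to_simplex` and `to_unit_sphere`, the softmax parameterisation of the simplex, and the example dimensions are assumptions made for illustration only, and the online dictionary learning step is omitted entirely.

```python
import math

def to_simplex(logits):
    """Softmax: map k real scores to k non-negative weights that sum to 1.

    Assumed parameterisation of the simplex constraint (hypothetical helper).
    """
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_unit_sphere(z, eps=1e-12):
    """L2-normalise a latent vector of any dimension d onto the unit hypersphere.

    Hypothetical helper; eps guards against division by zero.
    """
    norm = math.sqrt(sum(x * x for x in z))
    return [x / max(norm, eps) for x in z]

# Simplex: k = 3 archetype weights have only k - 1 = 2 free values,
# because the weights must sum to 1.
weights = to_simplex([2.0, 0.5, -1.0])

# Hypersphere: the latent dimension (here d = 8) is chosen
# independently of the number of archetypes.
code = to_unit_sphere([0.3, -1.2, 0.8, 2.0, -0.5, 0.1, 1.4, -0.9])
```

Under these assumptions, the simplex ties the latent dimensionality to the number of archetypes, whereas the unit-norm constraint leaves the latent dimension free.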

Related Material


@InProceedings{Wieser_2025_WACV,
    author    = {Wieser, Mario and Siegismund, Daniel and Steigele, Stephan},
    title     = {Revisiting Deep Archetypal Analysis for Phenotype Discovery in High Content Imaging},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {3802-3811}
}