Discrete Representation Learning for Modeling Imaging-Based Spatial Transcriptomics Data
Imaging-based spatial transcriptomics (ST) provides single-transcript spatial resolution for hundreds of genes, unlike sequencing-based ST technologies, whose resolution is limited to the physical capture regions (spots) on slides. Existing methods for identifying patterns of interest in imaging-based ST data are built as extensions of single-cell analysis methods and largely ignore the valuable spatial information encoded in the raw imaging data. Here we present a discrete representation learning approach for modeling spatial gene expression patterns in ST datasets. Using the raw coordinates of detected transcripts and a positional encoding of cell centroids as inputs, we learn discrete representations with a Vector Quantized Variational Autoencoder (VQ-VAE) to extract multi-scale structure from fluorescence in situ hybridization (FISH)-based ST datasets. We demonstrate the usefulness of discrete representations in terms of the embedding quality of ST data as well as improved performance on downstream tasks: extracting biologically meaningful cellular neighborhoods and identifying spatially variable genes.
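To make the two ingredients named above concrete, the sketch below illustrates (i) a sinusoidal positional encoding of 2D cell centroids and (ii) the nearest-neighbor codebook lookup at the heart of VQ-VAE quantization. This is a minimal illustrative sketch, not the paper's implementation: the function names, the encoding dimensionality, and the use of a plain sinusoidal scheme are assumptions, and the straight-through gradient estimator and commitment loss used to train a real VQ-VAE are omitted.

```python
import numpy as np

def sinusoidal_encoding(coords, dim=8):
    """Sinusoidal positional encoding of 2D centroids.

    coords: (n, 2) array of cell-centroid coordinates.
    Returns an (n, 4 * dim) feature array: for each of the two axes,
    `dim` sine and `dim` cosine features at geometrically spaced
    frequencies (transformer-style; illustrative choice).
    """
    freqs = 1.0 / (10000.0 ** (np.arange(dim) / dim))          # (dim,)
    angles = coords[:, :, None] * freqs[None, None, :]          # (n, 2, dim)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], -1)  # (n, 2, 2*dim)
    return enc.reshape(coords.shape[0], -1)                     # (n, 4*dim)

def vector_quantize(z_e, codebook):
    """Core VQ step: snap each encoder output to its nearest codebook vector.

    z_e:      (n, d) continuous encoder outputs.
    codebook: (K, d) learned code vectors.
    Returns the quantized vectors (n, d) and their discrete indices (n,).
    """
    d2 = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (n, K) squared distances
    idx = d2.argmin(axis=1)                                       # nearest code per input
    return codebook[idx], idx
```

The discrete indices returned by `vector_quantize` are what make the representation "discrete": downstream analyses (e.g. grouping cells into neighborhoods) can operate on these code assignments rather than on continuous embeddings.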