Extracting Gene Expression Patterns from High-Dimensional Transcriptomic Data
ORAL
Abstract
Whole-embryo transcriptomic analysis can reveal coordinated gene expression programs that underlie morphogenetic development. A central challenge in embryo-scale transcriptomics is to distinguish biologically meaningful heterogeneity from the technical noise inherent to high-dimensional datasets, while simultaneously recovering the global topological structure that organizes cellular diversity. Here, we present a method for extracting low-dimensional representations of large-scale gene expression datasets in two steps: (i) an initial dimensionality reduction that isolates biological signal from experimental noise; and (ii) an autoencoder-based machine learning framework that learns a low-dimensional parameterization of the data manifold, with an implicit bias toward preserving global neighborhood relationships. We validate our approach in Drosophila melanogaster by comparing inferred marker-gene expression patterns to in situ hybridization data, and we demonstrate the applicability of our method to higher-resolution, single-cell transcriptomic datasets.
*National Science Foundation, Grant PHY-2210612
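The two-step pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: PCA stands in for the unnamed dimensionality reduction step, the autoencoder is a small tanh network trained by plain gradient descent, the data are synthetic, and the names pca_reduce and train_autoencoder are hypothetical. The sketch does not attempt to reproduce the bias toward preserving global neighborhood relationships.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Step (i), assumed here to be PCA: keep the top principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:n_components].T
    return Z / Z.std(axis=0)  # rescale scores to unit variance for training

def init_layer(rng, fan_in, fan_out):
    """Random weights scaled by 1/sqrt(fan_in), zero biases."""
    W = rng.normal(0.0, 1.0 / np.sqrt(fan_in), (fan_in, fan_out))
    return W, np.zeros(fan_out)

def train_autoencoder(Z, latent_dim=2, hidden=16, lr=0.02, epochs=500, seed=0):
    """Step (ii): a small tanh autoencoder trained by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    sizes = [(d, hidden), (hidden, latent_dim), (latent_dim, hidden), (hidden, d)]
    layers = [init_layer(rng, i, o) for i, o in sizes]
    (W1, b1), (W2, b2), (W3, b3), (W4, b4) = layers
    losses = []
    for _ in range(epochs):
        # Forward pass: encoder (H, L) then decoder (G, out).
        H = np.tanh(Z @ W1 + b1)
        L = H @ W2 + b2
        G = np.tanh(L @ W3 + b3)
        out = G @ W4 + b4
        err = out - Z
        losses.append(0.5 * np.sum(err ** 2) / n)
        # Backward pass for the mean squared reconstruction loss.
        d_out = err / n
        d_g = (d_out @ W4.T) * (1.0 - G ** 2)
        d_l = d_g @ W3.T
        d_h = (d_l @ W2.T) * (1.0 - H ** 2)
        grads = [
            (Z.T @ d_h, d_h.sum(axis=0)),
            (H.T @ d_l, d_l.sum(axis=0)),
            (L.T @ d_g, d_g.sum(axis=0)),
            (G.T @ d_out, d_out.sum(axis=0)),
        ]
        for (W, b), (gW, gb) in zip(layers, grads):
            W -= lr * gW  # in-place update of the shared arrays
            b -= lr * gb

    def encode(Z_new):
        """Map PCA scores to the learned latent coordinates."""
        return np.tanh(Z_new @ W1 + b1) @ W2 + b2

    return encode, losses

# Synthetic stand-in for an expression matrix: 200 samples by 50 "genes",
# generated from a closed one-dimensional curve plus Gaussian noise.
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, 200)
signal = np.column_stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
X = signal @ rng.normal(size=(2, 50)) + 0.1 * rng.normal(size=(200, 50))

Z = pca_reduce(X, n_components=5)
encode, losses = train_autoencoder(Z)
latent = encode(Z)
```

In this toy setting, each row of latent gives a two-dimensional coordinate for one sample. Comparing such coordinates against known spatial patterns would be the analogue of the in situ validation mentioned in the abstract.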
Presenters
- Jeremy Lauro, University of California, Santa Barbara