Exploiting a Percolation Transition for the Clustering of Noisy Gene Expression Data

ORAL

Abstract

Gene expression largely determines the fate of individual cells and ultimately influences the development and behaviour of entire organisms. Thus, the ability to assess the abundance of the mRNA intermediates of gene expression on a genome-wide scale, down to single-cell resolution, promises to revolutionize our understanding of biological processes. While the collection of such data is growing rapidly thanks to experimental innovations, researchers face the challenge of identifying meaningful patterns and often need to discriminate subtle signals from a high noise floor. Here, we describe a density-based clustering approach that takes advantage of a percolation transition, which arises generically in random data, to help discriminate meaningful patterns of variation from noise. The method allows clustering parameters to be defined by statistical properties of the data itself, thus obviating arbitrary parameter choices or detailed knowledge of experimental noise sources. By applying this approach to data ranging from single cells to whole organisms, we reveal both known and previously unknown modules of co-regulated genes.
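The percolation transition mentioned above can be illustrated with a minimal sketch. This is not the authors' method, only an assumed toy setup: points drawn from pure Gaussian noise are linked whenever their Euclidean distance is below a threshold `eps`, and the fraction of points in the largest connected cluster is tracked as `eps` grows. The function names (`largest_cluster_fraction`, `percolation_curve`), the noise model, the distance measure and the 50% criterion used to locate the transition are all hypothetical choices made for illustration.

```python
import numpy as np


def largest_cluster_fraction(points, eps):
    """Fraction of points in the largest connected cluster when every
    pair of points closer than eps is linked (illustrative helper)."""
    n = len(points)
    # Pairwise Euclidean distances between all points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adjacent = dist < eps

    labels = -np.ones(n, dtype=int)  # -1 marks an unvisited point
    best = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        # Depth-first search over the threshold graph.
        stack = [start]
        labels[start] = start
        size = 0
        while stack:
            i = stack.pop()
            size += 1
            for j in np.flatnonzero(adjacent[i] & (labels < 0)):
                labels[j] = start
                stack.append(j)
        best = max(best, size)
    return best / n


def percolation_curve(points, eps_values):
    """Largest-cluster fraction for each threshold in eps_values."""
    return np.array([largest_cluster_fraction(points, e) for e in eps_values])


# Assumed toy data: 200 "genes" measured in 10 "samples", pure noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=(200, 10))

eps_values = np.linspace(0.5, 6.0, 23)
curve = percolation_curve(noise, eps_values)

# Hypothetical criterion: the first threshold at which the largest
# cluster contains at least half of all points.
eps_c = eps_values[np.argmax(curve >= 0.5)]
print("estimated transition threshold:", eps_c)
```

In a sketch like this, thresholds below the estimated transition leave noise fragmented into small clusters, so groups that already hold together at such thresholds are candidates for meaningful structure. How the actual method derives its parameters from the data is not specified in the abstract.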

Presenters

  • Steffen Werner

    AMOLF

Authors

  • Steffen Werner

    AMOLF

  • Tom S Shimizu

    Department of Living Matter, AMOLF

  • Greg Stephens

    Physics, Vrije Universiteit (Amsterdam) & OIST Graduate University (Okinawa)