Modeling intrinsic biases in high-throughput sequencing data for chromatin accessibility

Shengen Hu; Chongzhi Zang

ORAL

Abstract

Genome-wide profiling of chromatin accessibility with the assay for transposase-accessible chromatin using sequencing (ATAC-seq) or DNase I hypersensitivity sequencing (DNase-seq) is widely used to study regulatory DNA elements and transcriptional regulation in many cellular systems. Efficient and thorough computational analysis is essential for extracting biological information from such high-throughput sequencing data. DNase cleavage of DNA has been reported to have sequence preferences that can significantly affect footprint patterns at transcription factor binding sites in genomic profiles. We found that enzymatic sequence biases are common in both bulk and single-cell chromatin accessibility profiling data. Using a regular simplex encoding model, we developed a quantitative approach for accurate characterization and systematic correction of the intrinsic sequence biases in ATAC-seq and DNase-seq data. This approach can be applied to improve bioinformatic analysis of high-throughput chromatin accessibility sequencing data.
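
To illustrate what a regular simplex encoding of DNA sequence can look like, the following is a minimal sketch, not the authors' implementation. It assumes the four nucleotides are placed on the vertices of a regular tetrahedron in three dimensions, so that every pair of bases is equally distant and no base is treated as numerically "closer" to another. The vertex assignment and the names SIMPLEX, encode_kmer, and squared_distance are hypothetical; the abstract does not specify them.

```python
from itertools import combinations

# Assumed vertex assignment (not given in the abstract): each nucleotide is a
# vertex of a regular tetrahedron centered at the origin.
SIMPLEX = {
    "A": (1, -1, -1),
    "C": (-1, 1, -1),
    "G": (-1, -1, 1),
    "T": (1, 1, 1),
}


def encode_kmer(kmer):
    """Concatenate the 3-D simplex coordinates of each base in a k-mer."""
    features = []
    for base in kmer.upper():
        features.extend(SIMPLEX[base])
    return features


def squared_distance(u, v):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


# Collect the squared distance for every pair of distinct bases; a regular
# simplex yields a single shared value.
pairwise = {
    squared_distance(SIMPLEX[x], SIMPLEX[y]) for x, y in combinations("ACGT", 2)
}
```

Under this assumed encoding, a k-mer around a cleavage site becomes a vector of 3k numeric features, which could then serve as input to a regression model of cleavage bias; the choice of model and k is left open here.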

March 7, 2019, 12:27 PM – March 7, 2019, 12:39 PM

Presenters

Chongzhi Zang

University of Virginia

Authors

Shengen Hu

University of Virginia

Chongzhi Zang

University of Virginia