Abstract

Background: Chromatin accessibility profiling assays such as ATAC-seq and DNase1-seq offer the opportunity to rapidly characterize the regulatory state of the genome at single-nucleotide resolution. Optimized molecular protocols now allow a molecular biologist to produce next-generation sequencing libraries in several hours, leaving the analysis of sequencing data as the primary obstacle to wide-scale deployment of accessibility profiling assays. To address this obstacle, we have developed an optimized and efficient pipeline for the analysis of ATAC-seq and DNase1-seq data.

Results: We executed a multi-dimensional grid search on the NIH Biowulf supercomputing cluster, analyzing 4560 pipeline configurations to assess the impact of parameter selection on biological reproducibility and ChIP-seq recovery. Our analysis improved ChIP-seq recovery by 15% for ATAC-seq and 3% for DNase1-seq and determined that PCR duplicate removal improves biological reproducibility by 36% without a significant cost to transcription factor footprinting. Our analyses of downsampled reads identified a point of diminishing returns for increased library sequencing depth, with 95% of the ChIP-seq data of a 200 million read footprinting library recovered by 160 million reads.

Conclusions: We present optimized ATAC-seq and DNase1-seq pipelines in both Snakemake and bash formats, as well as optimal sequencing depths for ATAC-seq and DNase1-seq projects. The optimized analysis pipelines, parameters, and ground-truth ChIP-seq datasets have been made available for deployment and future algorithmic profiling.

Electronic supplementary material: The online version of this article (10.1186/s12864-018-4943-z) contains supplementary material, which is available to authorized users.
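
The grid search mentioned in the abstract can be pictured as enumerating every combination of pipeline parameter values and running the pipeline once per combination. The following is only a minimal sketch of that enumeration step. The parameter names and values in `PARAM_GRID` and the helper `enumerate_configurations` are hypothetical and chosen for illustration; they are not the parameters, ranges, or code used in the study.

```python
from itertools import product

# Hypothetical parameter grid: these names and values are illustrative
# assumptions, not the parameters or ranges evaluated in the paper.
PARAM_GRID = {
    "remove_pcr_duplicates": [True, False],
    "min_mapping_quality": [0, 10, 30],
    "peak_caller_threshold": [0.01, 0.05, 0.1],
}


def enumerate_configurations(grid):
    """Return one dict per combination of parameter values (a full grid)."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    # product() yields the Cartesian product of all value lists.
    return [dict(zip(names, values)) for values in product(*value_lists)]


configs = enumerate_configurations(PARAM_GRID)
# The grid size is the product of the number of values per parameter.
print(len(configs))
```

Each resulting dictionary would then be passed to one pipeline run, for example as one job on a cluster, and scored for reproducibility and ChIP-seq recovery. Because the grid size is the product of the per-parameter value counts, even a handful of parameters quickly produces thousands of configurations.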

Document type: Article

Original document

The different versions of the original document can be found in:

https://doaj.org/toc/1471-2164 under the license cc-by
http://link.springer.com/article/10.1186/s12864-018-4943-z/fulltext.html
http://dx.doi.org/10.1186/s12864-018-4943-z
https://pubmed.ncbi.nlm.nih.gov/30064353
https://www.ncbi.nlm.nih.gov/pubmed/30064353,
https://academic.microsoft.com/#/detail/2885585984 under the license http://creativecommons.org/licenses/by/4.0/

Document information

Published on 01/01/2018

Volume 2018, 2018
DOI: 10.1186/s12864-018-4943-z
Licence: Other
