RNA-seq data analysis pipelines generally consist of four steps: sequence alignment, expression quantification, expression normalization, and differentially expressed gene (DEG) detection. Each step can be carried out with many different tools or algorithms, so it is infeasible to explore every combination and provide a comprehensive comparison of pipeline performance. To understand how RNA-seq data analysis pipelines behave and to provide useful guidance for pipeline selection, we believe it is necessary to analyze the interactions among pipeline components. In this paper, we construct nine RNA-seq pipelines by combining different alignment algorithms with the same quantification, normalization, and DEG detection tools, and use them to analyze the impact of RNA-seq alignment on downstream applications of gene expression estimates. Specifically, we find a moderate linear correlation between the number of DEGs detected and the percentage of reads aligned with zero mismatches.
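As an illustration of the kind of relationship reported above, the following is a minimal sketch of a Pearson linear correlation between two per-pipeline quantities. The abstract does not state which correlation measure or software was used, so the choice of Pearson's coefficient, the function name `pearson_r`, and the variable names are all assumptions. The numbers are placeholders chosen for illustration only and are not results from the paper.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson linear correlation coefficient between two paired samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two paired observations")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviations of each observation from its sample mean.
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")

    return cov / sqrt(var_x * var_y)


# Placeholder values only (NOT data from the paper): one entry per pipeline.
zero_mismatch_pct = [10.0, 20.0, 30.0, 40.0]  # hypothetical % of reads aligned with zero mismatches
deg_counts = [100, 210, 290, 405]             # hypothetical number of DEGs detected

r = pearson_r(zero_mismatch_pct, deg_counts)
print(f"Pearson r = {r:.3f}")
```

With only a handful of pipelines, a coefficient like this is sensitive to individual points, so it is best read as a descriptive summary rather than a strong statistical claim.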
Published on 01/01/2014
Volume 2014, 2014
DOI: 10.1109/globalsip.2014.7032351
Licence: CC BY-NC-SA license