Abstract

In this paper we describe our initial experiences in the Cloud-e-Genome project with moving a whole-exome sequencing pipeline from a scripted, HPC-based solution to a workflow enactment system running in the cloud. We discuss the shortcomings of the existing script-based approach and list the benefits that a workflow-based solution can provide. Despite the effort involved in wrapping all required tools as workflow blocks, and the restrictions of the dataflow model used to represent workflows, we expect the migration to significantly improve the current state of the pipeline. Our target is to enable flexibility, traceability and reproducibility of the solution, so that it can better accommodate the evolution of the tools, the data and the pipeline itself, and allow us to run it at national scale. This work will become the foundation for a more complete system that includes variant filtering and interpretation for diagnostic purposes.


Original document

The different versions of the original document can be found in:

http://dx.doi.org/10.1109/ccgrid.2014.128
https://eprint.ncl.ac.uk/216149
https://ieeexplore.ieee.org/document/6846521
https://academic.microsoft.com/#/detail/2025676854

Document information

Published on 01/01/2014

Volume 2014, 2014
DOI: 10.1109/ccgrid.2014.128
Licence: CC BY-NC-SA license
