Exploring Genomic Datasets

Latest revision as of 06:31, 2 February 2021

Abstract

Genomic data management focuses on achieving high performance over big datasets using batch, cloud-based architectures. This enables the execution of massive pipelines, but it hampers exploration of the solution space when that space is not well defined, for example by choosing different experimental samples or query extraction parameters. We present PyGMQL, a Python-based interoperability software layer that enables testing of experimental pipelines. PyGMQL resolves the impedance mismatch between a batch execution environment and the agile programming style of Python, and it provides transparent access when exploration requires integrating local and remote resources. Wrapping PyGMQL and Python primitives within Jupyter notebooks guarantees reproducibility of the pipeline when it is used in different contexts or by different scientists. The software is freely available at https://github.com/DEIB-GECO/PyGMQL.

Original document

The different versions of the original document can be found in:

http://hdl.handle.net/11311/1095264

https://re.public.polimi.it/retrieve/handle/11311/1095264/392193/Nanni.pdf

http://dl.acm.org/ft_gateway.cfm?id=3214710&ftid=2052771&dwn=1

http://dx.doi.org/10.1145/3214708.3214710 under the license http://www.acm.org/publications/policies/copyright_policy#Background

https://dblp.uni-trier.de/db/conf/sigmod/exploredb2018.html#NanniPCC18

https://re.public.polimi.it/handle/11311/1095264

https://academic.microsoft.com/#/detail/2943444417
