Document type: Part of book or chapter of book
 
 
 
  
  

Latest revision as of 17:16, 21 January 2021

Abstract

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-54420-0_31.

A large, up-to-date compendium of integrated genomic data is often required for biological data analysis. The compendium can be tens of terabytes in size and must be updated frequently with new experimental data or metadata. Updating a compendium manually is cumbersome, requires a lot of unnecessary computation, and may introduce errors or inconsistencies. We propose a transparent, file-based approach for adding incremental update capabilities to unmodified genomics data analysis tools and pipeline workflow managers. This approach is implemented in the GeStore system. We evaluate GeStore using a real-world genomics compendium. Our results show that it is easy to add incremental updates to genomics data processing pipelines, and that incremental updates can reduce computation time enough that it becomes practical to maintain large-scale, up-to-date genomics compendia on small clusters.
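As a rough illustration of the file-based idea, the sketch below compares two versions of an input file and emits only the records that are new or changed, so that an unmodified tool can be run on the smaller file. This is not GeStore's implementation: the FASTA-like record format, the hashing scheme, and the names `parse_records` and `incremental_input` are assumptions made for this example.

```python
import hashlib


def parse_records(text):
    """Parse FASTA-like text into {record_id: sequence} (format is an assumption)."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def digest(value):
    """Return a content hash used to detect changed records."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def incremental_input(old_text, new_text):
    """Return a file body holding only records added or changed since old_text."""
    old_hashes = {rid: digest(seq) for rid, seq in parse_records(old_text).items()}
    new_records = parse_records(new_text)
    changed = {
        rid: seq
        for rid, seq in new_records.items()
        if old_hashes.get(rid) != digest(seq)
    }
    return "".join(f">{rid}\n{seq}\n" for rid, seq in changed.items())


old_version = ">a\nACGT\n>b\nGGCC\n"
new_version = ">a\nACGT\n>b\nGGTT\n>c\nTTAA\n"
delta = incremental_input(old_version, new_version)
```

Here `delta` contains only records `b` and `c`; record `a` is unchanged and is skipped. The sketch ignores deleted records and the merging of partial results back into the compendium, both of which a real incremental update system would also have to handle.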


Original document

The different versions of the original document can be found in:

http://link.springer.com/content/pdf/10.1007/978-3-642-54420-0_31
http://dx.doi.org/10.1007/978-3-642-54420-0_31 under the license http://www.springer.com/tdm
https://link.springer.com/chapter/10.1007/978-3-642-54420-0_31
https://core.ac.uk/display/43615833
https://dblp.uni-trier.de/db/conf/europar/europar2013w.html#PedersenWB13
https://www.scipedia.com/public/Pedersen_et_al_2014a
https://munin.uit.no/handle/10037/7563
https://munin.uit.no/bitstream/handle/10037/7563/article.pdf?sequence=1&isAllowed=y
https://munin.uit.no/bitstream/10037/7563/1/article.pdf
http://www.ub.uit.no/munin/handle/10037/6902
https://hdl.handle.net/10037/6902
http://hdl.handle.net/10037/7563
https://academic.microsoft.com/#/detail/1839470988

Document information

Published on 01/01/2014

Volume 2014, 2014
DOI: 10.1007/978-3-642-54420-0_31
Licence: CC BY-NC-SA
