Software engineering for scientific big data analysis

BA Grüning, S Lampa, M Vaudel, D Blankenberg - GigaScience, 2019 - academic.oup.com
The increasing complexity of data and analysis methods has created an environment where
scientists, who may not have formal training, are finding themselves playing the impromptu …

SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines

S Lampa, M Dahlö, J Alvarsson, O Spjuth - GigaScience, 2019 - academic.oup.com
Background: The complex nature of biological data has driven the development of
specialized software tools. Scientific workflow management systems simplify the assembly of …

OpenAlea: scientific workflows combining data analysis and simulation

C Pradal, C Fournier, P Valduriez… - Proceedings of the 27th …, 2015 - dl.acm.org
Analyzing biological data (e.g., annotating genomes, assembling NGS data...) may involve
very complex and interlinked steps where several tools are combined together. Scientific …

BiobankCloud: a platform for the secure storage, sharing, and processing of large biomedical data sets

A Bessani, J Brandt, M Bux, V Cogo, L Dimitrova… - … Data Management and …, 2016 - Springer
Biobanks store and catalog human biological material that is increasingly being digitized
using next-generation sequencing (NGS). There is, however, a computational bottleneck, as …

SAASFEE: scalable scientific workflow execution engine

M Bux, J Brandt, C Lipka, K Hakimzadeh… - Proceedings of the …, 2015 - dl.acm.org
Across many fields of science, primary data sets like sensor read-outs, time series, and
genomic sequences are analyzed by complex chains of specialized tools and scripts …

InfraPhenoGrid: a scientific workflow infrastructure for plant phenomics on the grid

C Pradal, S Artzet, J Chopard, D Dupuis… - Future Generation …, 2017 - Elsevier
Plant phenotyping consists in the observation of physical and biochemical traits of plant
genotypes in response to environmental conditions. Challenges, in particular in context of …

Unifying package managers, workflow engines, and containers: Computational reproducibility with BioNix

J Bedő, L Di Stefano, AT Papenfuss - GigaScience, 2020 - academic.oup.com
Motivation: A challenge for computational biologists is to make our analyses reproducible,
i.e., to rerun, combine, and share, with the assurance that equivalent runs will generate …

Estimating genome-wide regulatory activity from multi-omics data sets using mathematical optimization

S Trescher, J Münchmeyer, U Leser - BMC Systems Biology, 2017 - Springer
Background: Gene regulation is one of the most important cellular processes, indispensable
for the adaptability of organisms and closely interlinked with several classes of pathogenesis …

Computation semantics of the functional scientific workflow language Cuneiform

J Brandt, W Reisig, U Leser - Journal of Functional Programming, 2017 - cambridge.org
Cuneiform is a minimal functional programming language for large-scale scientific data
analysis. Implementing a strict black-box view on external operators and data, it allows the …

Portability of scientific workflows in NGS data analysis: a case study

C Schiefer, M Bux, J Brandt, C Messerschmidt… - arXiv preprint arXiv …, 2020 - arxiv.org
The analysis of next-generation sequencing (NGS) data requires complex computational
workflows consisting of dozens of autonomously developed yet interdependent processing …