Tarema: Adaptive resource allocation for scalable scientific workflows in heterogeneous clusters

J Bader, L Thamsen, S Kulagina, J Will… - … Conference on Big …, 2021 - ieeexplore.ieee.org
Scientific workflow management systems like Nextflow support large-scale data analysis by
abstracting away the details of scientific workflows. In these systems, workflows consist of …

Lotaru: Locally estimating runtimes of scientific workflow tasks in heterogeneous clusters

J Bader, F Lehmann, L Thamsen, J Will… - Proceedings of the 34th …, 2022 - dl.acm.org
Many scientific workflow scheduling algorithms need to be informed about task runtimes a-
priori to conduct efficient scheduling. In heterogeneous cluster infrastructures, this problem …

Sweep: Accelerating scientific research through scalable serverless workflows

A John, K Ausmees, K Muenzen, C Kuhn… - Proceedings of the 12th …, 2019 - dl.acm.org
Scientific and commercial applications are increasingly being executed in the cloud, but the
difficulties associated with cluster management render on-demand resources inaccessible …

Privacy-preserving workflow scheduling in geo-distributed data centers

Y Xiao, AC Zhou, X Yang, B He - Future Generation Computer Systems, 2022 - Elsevier
Due to the increasing volume of data to be analyzed and the need for global collaborations,
many scientific applications have been deployed in a geo-distributed manner. Scientific …

BiobankCloud: a platform for the secure storage, sharing, and processing of large biomedical data sets

A Bessani, J Brandt, M Bux, V Cogo, L Dimitrova… - … Data Management and …, 2016 - Springer
Biobanks store and catalog human biological material that is increasingly being digitized
using next-generation sequencing (NGS). There is, however, a computational bottleneck, as …

Raw data queries during data-intensive parallel workflow execution

V Silva, J Leite, JJ Camata, D De Oliveira… - Future Generation …, 2017 - Elsevier
Computer simulations consume and produce huge amounts of raw data files presented in
different formats, e.g., HDF5 in computational fluid dynamics simulations. Users often need to …

Applying big data paradigms to a large scale scientific workflow: Lessons learned and future directions

S Caíno-Lores, A Lapin, J Carretero, P Kropf - Future Generation Computer …, 2020 - Elsevier
The increasing amounts of data related to the execution of scientific workflows has raised
awareness of their shift towards parallel data-intensive problems. In this paper, we deliver …

Hopsworks: Improving user experience and development on Hadoop with scalable, strongly consistent metadata

M Ismail, E Gebremeskel, T Kakantousis… - 2017 IEEE 37th …, 2017 - ieeexplore.ieee.org
Hadoop is a popular system for storing, managing, and processing large volumes of data,
but it has bare-bones internal support for metadata, as metadata is a bottleneck and less …

Feedback-based resource allocation for batch scheduling of scientific workflows

C Witt, D Wagner, U Leser - 2019 International Conference on …, 2019 - ieeexplore.ieee.org
A scientific workflow is a set of interdependent compute tasks orchestrating large scale data
analyses or in-silico experiments. Workflows often comprise thousands of tasks with …

Executing cyclic scientific workflows in the cloud

M Krämer, HM Würz, C Altenhofen - Journal of Cloud Computing, 2021 - Springer
We present an algorithm and a software architecture for a cloud-based system that executes
cyclic scientific workflows whose structure may change during run time. Existing approaches …