Authors
Silvina Caíno-Lores, Andrei Lapin, Peter Kropf, Jesús Carretero
Publication date
2016
Workshop paper
11th Workshop on Workflows in Support of Large-Scale Science (WORKS 2016)
Description
The increasing amount of data related to the execution of scientific workflows has raised awareness of their shift towards parallel data-intensive problems. In this paper, we deliver our experience with combining the traditional high-performance computing and grid-based approaches for scientific workflows with Big Data analytics paradigms. Our goal was to assess and discuss the suitability of such data-intensive-oriented mechanisms for production-ready workflows, especially in terms of scalability, focusing on a key element in the Big Data ecosystem: the data-centric programming model. Hence, we reproduced the functionality of an MPI-based iterative workflow from the hydrology domain, EnKF-HGS, using the Spark data analysis framework. We conducted experiments on a local cluster, and we relied on our results to discuss promising directions for further research.
Scholar articles
S Caíno-Lores, A Lapin, PG Kropf, J Carretero - WORKS@SC, 2016