Authors
Silvina Caíno-Lores, Andrei Lapin, Peter Kropf, Jesús Carretero
Publication date
2016
Workshop paper
11th Workshop on Workflows in Support of Large-Scale Science (WORKS 2016)
Description
The increasing amount of data related to the execution of scientific workflows has raised awareness of their shift towards parallel data-intensive problems. In this paper, we report our experience with combining the traditional high-performance computing and grid-based approaches for scientific workflows with Big Data analytics paradigms. Our goal was to assess and discuss the suitability of such data-intensive-oriented mechanisms for production-ready workflows, especially in terms of scalability, focusing on a key element in the Big Data ecosystem: the data-centric programming model. Hence, we reproduced the functionality of an MPI-based iterative workflow from the hydrology domain, EnKF-HGS, using the Spark data analysis framework. We conducted experiments on a local cluster, and we relied on our results to discuss promising directions for further research.
Total citations
[Chart: citations per year, 2017 to 2020]