Authors
Silvina Caíno-Lores, Jesús Carretero, Bogdan Nicolae, Orcun Yildiz, Tom Peterka
Publication date
2018/12/17
Conference
2018 IEEE/ACM 5th International Conference on Big Data Computing Applications and Technologies (BDCAT)
Pages
1-10
Publisher
IEEE
Description
Today's scientific applications increasingly rely on a variety of data sources, storage facilities, and computing infrastructures, and there is a growing demand for data analysis and visualization for these applications. In this context, exploiting Big Data frameworks for scientific computing is an opportunity to incorporate high-level libraries, platforms, and algorithms for machine learning, graph processing, and streaming; inherit their data awareness and fault-tolerance; and increase productivity. Nevertheless, limitations exist when Big Data platforms are integrated with an HPC environment, namely poor scalability, severe memory overhead, and huge development effort. This paper focuses on a popular Big Data framework (Apache Spark) and proposes an architecture to support the integration of highly scalable MPI block-based data models and communication patterns with a map-reduce-based programming …
Total citations
[Chart: citations per year, 2019 to 2022]
Scholar articles
S Caíno-Lores, J Carretero, B Nicolae, O Yildiz… - 2018 IEEE/ACM 5th International Conference on Big …, 2018