Spark-diy: A framework for interoperable spark operations with high performance block-based data models

Authors

Silvina Caíno-Lores, Jesús Carretero, Bogdan Nicolae, Orcun Yildiz, Tom Peterka

Publication date

2018/12/17

Conference

2018 IEEE/ACM 5th International Conference on Big Data Computing Applications and Technologies (BDCAT)

Pages

1-10

Publisher

IEEE

Description

Today's scientific applications are increasingly relying on a variety of data sources, storage facilities, and computing infrastructures, and there is a growing demand for data analysis and visualization for these applications. In this context, exploiting Big Data frameworks for scientific computing is an opportunity to incorporate high-level libraries, platforms, and algorithms for machine learning, graph processing, and streaming; inherit their data awareness and fault-tolerance; and increase productivity. Nevertheless, limitations exist when Big Data platforms are integrated with an HPC environment, namely poor scalability, severe memory overhead, and huge development effort. This paper focuses on a popular Big Data framework, Apache Spark, and proposes an architecture to support the integration of highly scalable MPI block-based data models and communication patterns with a map-reduce-based programming …

Total citations

Cited by 19

Citations per year: 2019: 9, 2020: 1, 2021: 4, 2022: 5
