Related articles - Academic Resource Search

Clash of the titans: MapReduce vs. Spark for large scale data analytics

J Shi, Y Qiu, UF Minhas, L Jiao, C Wang… - Proceedings of the …, 2015 - dl.acm.org

MapReduce and Spark are two very popular open source cluster computing frameworks for
large scale data analytics. These frameworks hide the complexity of task parallelism and …

Cited by 319 - Related articles - All 11 versions

Large scale distributed data science using Apache Spark

JG Shanahan, L Dai - Proceedings of the 21th ACM SIGKDD …, 2015 - dl.acm.org

Apache Spark is an open-source cluster computing framework for big data processing. It has
emerged as the next generation big data processing engine, overtaking Hadoop …

Cited by 235 - Related articles - All 2 versions

[PDF] nus.edu.sg

Map-join-reduce: Toward scalable and efficient data analysis on large clusters

D Jiang, AKH Tung, G Chen - IEEE Transactions on Knowledge …, 2010 - ieeexplore.ieee.org

Data analysis is an important functionality in cloud computing which allows a huge amount
of data to be processed over very large clusters. MapReduce is recognized as a popular …

Cited by 241 - Related articles - All 12 versions

[PDF] escholarship.org

[BOOK][B] An architecture for fast and general data processing on large clusters

M Zaharia - 2016 - books.google.com

The past few years have seen a major change in computing systems, as growing data
volumes and stalling processor speeds require more and more applications to scale out to …

Cited by 178 - Related articles - All 13 versions

[PDF] psu.edu

Exploring MapReduce efficiency with highly-distributed data

M Cardosa, C Wang, A Nangia, A Chandra… - Proceedings of the …, 2011 - dl.acm.org

MapReduce is a highly-popular paradigm for high-performance computing over large data
sets in large-scale platforms. However, when the source data is widely distributed and the …

Cited by 103 - Related articles - All 9 versions

[PDF] acm.org

MapReduce: simplified data processing on large clusters

J Dean, S Ghemawat - Communications of the ACM, 2008 - dl.acm.org

MapReduce is a programming model and an associated implementation for processing and
generating large datasets that is amenable to a broad variety of real-world tasks. Users …

Cited by 23545 - Related articles - All 86 versions

[PDF] arxiv.org

The family of MapReduce and large-scale data processing systems

S Sakr, A Liu, AG Fayoumi - ACM Computing Surveys (CSUR), 2013 - dl.acm.org

In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …

Cited by 260 - Related articles - All 9 versions

[PDF] alexdelis.eu

Parallel data processing with MapReduce: a survey

KH Lee, YJ Lee, H Choi, YD Chung, B Moon - ACM SIGMOD Record, 2012 - dl.acm.org

A prominent parallel data processing tool MapReduce is gaining significant momentum from
both industry and academia as the volume of data to analyze grows rapidly. While …

Cited by 955 - Related articles - All 27 versions

[PDF] escholarship.org

Themis: an I/O-efficient MapReduce

A Rasmussen, VT Lam, M Conley, G Porter… - Proceedings of the …, 2012 - dl.acm.org

"Big Data" computing increasingly utilizes the MapReduce programming model for scalable
processing of large data collections. Many MapReduce jobs are I/O-bound, and so …

Cited by 130 - Related articles - All 14 versions

[PDF] usenix.org

[PDF] Spark: Cluster computing with working sets

M Zaharia, M Chowdhury, MJ Franklin… - 2nd USENIX workshop …, 2010 - usenix.org

MapReduce and its variants have been highly successful in implementing large-scale
data-intensive applications on commodity clusters. However, most of these systems are built …

Cited by 7606 - Related articles - All 54 versions
