Related articles - Academic resource search

[Book][B] An architecture for fast and general data processing on large clusters

M Zaharia - 2016 - books.google.com

The past few years have seen a major change in computing systems, as growing data
volumes and stalling processor speeds require more and more applications to scale out to …

Cited by 171 · Related articles · All 13 versions

[PDF] usenix.org

[PDF] Spark: Cluster computing with working sets

M Zaharia, M Chowdhury, MJ Franklin… - 2nd USENIX workshop …, 2010 - usenix.org

MapReduce and its variants have been highly successful in implementing large-scale data-
intensive applications on commodity clusters. However, most of these systems are built …

Cited by 7349 · Related articles · All 54 versions

[PDF] researchgate.net

Clash of the titans: MapReduce vs. Spark for large scale data analytics

J Shi, Y Qiu, UF Minhas, L Jiao, C Wang… - Proceedings of the …, 2015 - dl.acm.org

MapReduce and Spark are two very popular open source cluster computing frameworks for
large scale data analytics. These frameworks hide the complexity of task parallelism and …

Cited by 307 · Related articles · All 11 versions

Large scale distributed data science using Apache Spark

JG Shanahan, L Dai - Proceedings of the 21th ACM SIGKDD …, 2015 - dl.acm.org

Apache Spark is an open-source cluster computing framework for big data processing. It has
emerged as the next generation big data processing engine, overtaking Hadoop …

Cited by 227 · Related articles · All 2 versions

[PDF] researchgate.net

SCOPE: parallel databases meet MapReduce

J Zhou, N Bruno, MC Wu, PA Larson, R Chaiken… - The VLDB Journal, 2012 - Springer

Companies providing cloud-scale data services have increasing needs to store and analyze
massive data sets, such as search logs, click streams, and web graph data. For cost and …

Cited by 197 · Related articles · All 14 versions

[PDF] acm.org

MapReduce: simplified data processing on large clusters

J Dean, S Ghemawat - Communications of the ACM, 2008 - dl.acm.org

MapReduce is a programming model and an associated implementation for processing and
generating large datasets that is amenable to a broad variety of real-world tasks. Users …

Cited by 22932 · Related articles · All 86 versions

Disco: a computing platform for large-scale data analytics

P Mundkur, V Tuulos, J Flatow - Proceedings of the 10th ACM SIGPLAN …, 2011 - dl.acm.org

We describe the design and implementation of Disco, a distributed computing platform for
MapReduce style computations on large-scale data. Disco is designed for operation in …

Cited by 48 · Related articles

[PDF] ttsell.ir

Distributed data management using MapReduce

F Li, BC Ooi, MT Özsu, S Wu - ACM Computing Surveys (CSUR), 2014 - dl.acm.org

MapReduce is a framework for processing and managing large-scale datasets in a
distributed cluster, which has been used for applications such as generating search indexes …

Cited by 247 · Related articles · All 15 versions

[PDF] arxiv.org

Muppet: MapReduce-style processing of fast data

W Lam, L Liu, STS Prasad, A Rajaraman… - arXiv preprint arXiv …, 2012 - arxiv.org

MapReduce has emerged as a popular method to process big data. In the past few years,
however, not just big data, but fast data has also exploded in volume and availability …

Cited by 196 · Related articles · All 16 versions

[PDF] timkaldewey.de

Clydesdale: structured data processing on MapReduce

T Kaldewey, EJ Shekita, S Tata - … of the 15th international conference on …, 2012 - dl.acm.org

MapReduce has emerged as a promising architecture for large scale data analytics on
commodity clusters. The rapid adoption of Hive, a SQL-like data processing language on …

Cited by 70 · Related articles · All 14 versions
