Related articles - Academic resource search

Scarlett: coping with skewed content popularity in MapReduce clusters

G Ananthanarayanan, S Agarwal, S Kandula… - Proceedings of the sixth …, 2011 - dl.acm.org

To improve data availability and resilience, MapReduce frameworks use file systems that
replicate data uniformly. However, analysis of job logs from a large production cluster shows …

Cited by 426 - All 13 versions

MapReduce optimization using regulated dynamic prioritization

T Sandholm, K Lai - Proceedings of the eleventh international joint …, 2009 - dl.acm.org

We present a system for allocating resources in shared data and compute clusters that
improves MapReduce job scheduling in three ways. First, the system uses regulated and …

Cited by 260 - All 5 versions

Chukwa: a system for reliable large-scale log collection

A Rabkin, R Katz - 24th Large Installation System Administration …, 2010 - usenix.org

Large Internet services companies like Google, Yahoo, and Facebook use the MapReduce
programming model to process log data. MapReduce is designed to work on data stored in …

Cited by 151 - All 9 versions

DynamicMR: A dynamic slot allocation optimization framework for MapReduce clusters

S Tang, BS Lee, B He - IEEE Transactions on Cloud …, 2014 - ieeexplore.ieee.org

MapReduce is a popular computing paradigm for large-scale data processing in cloud
computing. However, the slot-based MapReduce system (e.g., Hadoop MRv1) can suffer from …

Cited by 109 - All 7 versions

ARIA: automatic resource inference and allocation for MapReduce environments

A Verma, L Cherkasova, RH Campbell - Proceedings of the 8th ACM …, 2011 - dl.acm.org

MapReduce and Hadoop represent an economically compelling alternative for efficient
large scale data processing and advanced analytics in the enterprise. A key challenge in …

Cited by 610 - All 12 versions

Exploring MapReduce efficiency with highly-distributed data

M Cardosa, C Wang, A Nangia, A Chandra… - Proceedings of the …, 2011 - dl.acm.org

MapReduce is a highly-popular paradigm for high-performance computing over large data
sets in large-scale platforms. However, when the source data is widely distributed and the …

Cited by 103 - All 9 versions

Trojan data layouts: right shoes for a running elephant

A Jindal, JA Quiané-Ruiz, J Dittrich - … of the 2nd ACM Symposium on …, 2011 - dl.acm.org

MapReduce is becoming ubiquitous in large-scale data analysis. Several recent works have
shown that the performance of Hadoop MapReduce could be improved, for instance, by …

Cited by 148 - All 14 versions

Sailfish: A framework for large scale data processing

S Rao, R Ramakrishnan, A Silberstein… - Proceedings of the …, 2012 - dl.acm.org

In this paper, we present Sailfish, a new Map-Reduce framework for large scale data
processing. The Sailfish design is centered around aggregating intermediate data …

Cited by 147

Improving MapReduce performance using smart speculative execution strategy

Q Chen, C Liu, Z Xiao - IEEE Transactions on Computers, 2013 - ieeexplore.ieee.org

MapReduce is a widely used parallel computing framework for large scale data processing.
The two major performance metrics in MapReduce are job execution time and cluster …

Cited by 255 - All 8 versions

Improving MapReduce performance in heterogeneous environments

M Zaharia, A Konwinski, AD Joseph, RH Katz, I Stoica - OSDI, 2008 - usenix.org

MapReduce is emerging as an important programming model for large-scale data-parallel
applications such as web indexing, data mining, and scientific simulation. Hadoop is an …

Cited by 2460 - All 34 versions
