Map-join-reduce: Toward scalable and efficient data analysis on large clusters

S Sakr, A Liu, AG Fayoumi - ACM Computing Surveys (CSUR), 2013 - dl.acm.org

In the last two decades, the continuous increase of computational power has produced an
overwhelming flow of data which has called for a paradigm shift in the computing …

Cited by 256 · Related articles · All 9 versions

[PDF] ttsell.ir

Distributed data management using MapReduce

F Li, BC Ooi, MT Özsu, S Wu - ACM Computing Surveys (CSUR), 2014 - dl.acm.org

MapReduce is a framework for processing and managing large-scale datasets in a
distributed cluster, which has been used for applications such as generating search indexes …

Cited by 247 · Related articles · All 15 versions

[PDF] unibo.it

Data-intensive applications, challenges, techniques and technologies: A survey on Big Data

CLP Chen, CY Zhang - Information sciences, 2014 - Elsevier

It is already true that Big Data has drawn huge attention from researchers in information
sciences, policy and decision makers in governments and enterprises. As the speed of …

Cited by 3683 · Related articles · All 13 versions

[PDF] snu.ac.kr

Parallel data processing with MapReduce: a survey

KH Lee, YJ Lee, H Choi, YD Chung, B Moon - ACM SIGMOD Record, 2012 - dl.acm.org

A prominent parallel data processing tool MapReduce is gaining significant momentum from
both industry and academia as the volume of data to analyze grows rapidly. While …

Cited by 944 · Related articles · All 27 versions

[PDF] psu.edu

Big data processing in cloud computing environments

C Ji, Y Li, W Qiu, U Awada, K Li - 2012 12th international …, 2012 - ieeexplore.ieee.org

With the rapid growth of emerging applications like social network analysis, semantic Web
analysis and bioinformatics network analysis, a variety of data to be processed continues to …

Cited by 466 · Related articles · All 8 versions

[PDF] princeton.edu

A survey of large-scale analytical query processing in MapReduce

C Doulkeridis, K Nørvåg - The VLDB journal, 2014 - Springer

Enterprises today acquire vast volumes of data from different sources and leverage this
information by means of data analysis to support effective decision-making and provide new …

Cited by 339 · Related articles · All 15 versions

[PDF] psu.edu

Llama: leveraging columnar storage for scalable join processing in the MapReduce framework

Y Lin, D Agrawal, C Chen, BC Ooi, S Wu - Proceedings of the 2011 ACM …, 2011 - dl.acm.org

To achieve high reliability and scalability, most large-scale data warehouse systems have
adopted the cluster-based architecture. In this paper, we propose the design of a new cluster …

Cited by 187 · Related articles · All 4 versions

[PDF] sciencedirect.com

Unstructured data analysis on big data using map reduce

V Subramaniyaswamy, V Vijayakumar… - Procedia Computer …, 2015 - Elsevier

In the real time scenario, the volume of data used linearly increases with time. Social
networking sites like Facebook, Twitter discovered the growth of data which will be …

Cited by 108 · Related articles · All 5 versions

Performance evaluation of K-means clustering on Hadoop infrastructure

S Vats, BB Sagar - Journal of Discrete Mathematical Sciences and …, 2019 - Taylor & Francis

Today we are living with the extensive volume of information which is developing at a very
fast pace. Clustering is the process of group similar kinds of information. The serial k-means …

Cited by 51 · Related articles

MapReduce parallel programming model: a state-of-the-art survey

R Li, H Hu, H Li, Y Wu, J Yang - International Journal of Parallel …, 2016 - Springer

With the development of information technologies, we have entered the era of Big Data.
Google's MapReduce programming model and its open-source implementation in Apache …

Cited by 91 · Related articles · All 5 versions
