The Hadoop distributed file system

K Shvachko, H Kuang, S Radia… - 2010 IEEE 26th …, 2010 - ieeexplore.ieee.org
The Hadoop Distributed File System (HDFS) is designed to store very large data sets
reliably, and to stream those data sets at high bandwidth to user applications. In a large …

[BOOK][B] Data-intensive text processing with MapReduce

J Lin, C Dyer - 2022 - books.google.com
Our world is being revolutionized by data-driven methods: access to large amounts of data
has generated new insights and opened exciting new opportunities in commerce, science …

Improving MapReduce performance through data placement in heterogeneous Hadoop clusters

J Xie, S Yin, X Ruan, Z Ding, Y Tian… - … on parallel & …, 2010 - ieeexplore.ieee.org
MapReduce has become an important distributed processing model for large-scale
data-intensive applications like data mining and web indexing. Hadoop, an open-source …

The Hadoop distributed filesystem: Balancing portability and performance

J Shafer, S Rixner, AL Cox - 2010 IEEE International …, 2010 - ieeexplore.ieee.org
Hadoop is a popular open-source implementation of MapReduce for the analysis of large
datasets. To manage storage resources across the cluster, Hadoop uses a distributed user …

[PDF][PDF] A comprehensive survey for Hadoop distributed file system

KJ Merceedi, NA Sabry - Asian Journal of Research in Computer …, 2021 - academia.edu
In the last few days, data and the internet have become increasingly growing, occurring in
big data. For these problems, there are many software frameworks used to increase the …

A novel approach to improving the efficiency of storing and accessing small files on Hadoop: a case study by PowerPoint files

B Dong, J Qiu, Q Zheng, X Zhong… - 2010 IEEE International …, 2010 - ieeexplore.ieee.org
Hadoop distributed file system (HDFS) becomes a representative cloud storage platform,
benefiting from its reliable, scalable and low-cost storage capability. HDFS has been utilized …

An optimized approach for storing and accessing small files on cloud storage

B Dong, Q Zheng, F Tian, KM Chao, R Ma… - Journal of Network and …, 2012 - Elsevier
Hadoop distributed file system (HDFS) is widely adopted to support Internet services.
Unfortunately, native HDFS does not perform well for large numbers but small size files …

MapReduce: simplified data analysis of big data

S Maitrey, CK Jha - Procedia Computer Science, 2015 - Elsevier
With the development of computer technology, there is a tremendous increase in the growth
of data. Scientists are overwhelmed with this increasing amount of data processing needs …

Analytical review on Hadoop Distributed file system

K Dwivedi, SK Dubey - 2014 5th International Conference …, 2014 - ieeexplore.ieee.org
Hadoop Distributed file System is used for processing, storing and analyzing very large
amount of unstructured data. It stores the data reliably and provides fault tolerance, fast and …

On the duality of data-intensive file system design: reconciling HDFS and PVFS

W Tantisiriroj, SW Son, S Patil, SJ Lang… - Proceedings of 2011 …, 2011 - dl.acm.org
Data-intensive applications fall into two computing styles: Internet services (cloud
computing) or high-performance computing (HPC). In both categories, the underlying file …