Small files' problem in Hadoop: A systematic literature review

R Aggarwal, J Verma, M Siwach - … of King Saud University-Computer and …, 2022 - Elsevier
Apache Hadoop is an open-source software library which integrates a wide variety of
software tools and utilities to facilitate the distributed batch processing of big data sets …

Quality assurance technologies of big data applications: A systematic literature review

S Ji, Q Li, W Cao, P Zhang, H Muccini - Applied Sciences, 2020 - mdpi.com
Big data applications are currently used in many application domains, ranging from
statistical applications to prediction systems and smart cities. However, the quality of these …

A new secure model for data protection over cloud computing

AM Sauber, PM El-Kafrawy, AF Shawish… - Computational …, 2021 - Wiley Online Library
The main goal of any data storage model on the cloud is accessing data in an easy way
without risking its security. A security consideration is a major aspect in any cloud data …

Small sized file storage problems in Hadoop distributed file system

N Alange, A Mathur - 2019 international conference on smart …, 2019 - ieeexplore.ieee.org
Hadoop Distributed File System (HDFS) is widely used to store the files, which are having
heavy size. HDFS is so-called as distributed file system, which intends to store and access …

A small file merging strategy for spatiotemporal data in smart health

L Xiong, Y Zhong, X Liu, L Yang - IEEE Access, 2019 - ieeexplore.ieee.org
With the rapid development of smart health, the health sensors and wearable devices bring
huge amounts of small files of spatiotemporal data, which are distributed in different servers …

A novel Hadoop security model for addressing malicious collusive workers

AM Sauber, A Awad, AF Shawish… - Computational …, 2021 - Wiley Online Library
With the daily increase of data production and collection, Hadoop is a platform for
processing big data on a distributed system. A master node globally manages running jobs …

A review of various optimization schemes of small files storage on Hadoop

L Huang, J Liu, W Meng - 2018 37th Chinese Control …, 2018 - ieeexplore.ieee.org
With the rapid development of electronic information industry and the increasing amount of
Internet traffic, the amount of data generated grows exponentially. Such a large scale of data …

The main characteristics of five distributed file systems required for big data: A comparative study

A Elomari, L Hassouni, A Maizate - Adv. Sci. Technol. Eng. Syst, 2017 - researchgate.net
These last years, the amount of data generated by information systems has exploded. It is
not only the quantities of information that are now estimated in Exabyte, but also the variety …

Quality assurance technologies of big data applications: A systematic literature review

P Zhang, W Cao, H Muccini - arXiv preprint arXiv:2002.01759, 2020 - arxiv.org
Big data applications are currently used in many application domains, ranging from
statistical applications to prediction systems and smart cities. However, the quality of these …

HDFSx: an enhanced model to handle small files in Hadoop with a simulating toolkit

PM El Kafrawy, AM Sauber, MM Hafez… - 2018 1st International …, 2018 - ieeexplore.ieee.org
We live in Big Data era, where all data about our lives is captured, stored, processed and
used to change the world around us. This data is generated by different sources such as …