A review on big data based parallel and distributed approaches of pattern mining

S Kumar, KK Mohbey - Journal of King Saud University-Computer and …, 2022 - Elsevier
Pattern mining is a fundamental technique of data mining to discover interesting correlations
in the data set. There are several variations of pattern mining, such as frequent itemset …

Energy-efficient Hadoop for big data analytics and computing: A systematic review and research insights

WT Wu, WW Lin, CH Hsu, LG He - Future Generation Computer Systems, 2018 - Elsevier
As the demands for big data analytics keep growing rapidly in scientific applications and
online services, MapReduce and its open-source implementation Hadoop gained popularity …

PHDFS: Optimizing I/O performance of HDFS in deep learning cloud computing platform

Z Zhu, L Tan, Y Li, C Ji - Journal of Systems Architecture, 2020 - Elsevier
For deep learning cloud computing platforms, file system is a fundamental and critical
component. Hadoop distributed file system (HDFS) is widely used in large scale clusters due …

AQWA: adaptive query workload aware partitioning of big spatial data

AM Aly, AR Mahmood, MS Hassan, WG Aref… - Proceedings of the …, 2015 - dl.acm.org
The unprecedented spread of location-aware devices has resulted in a plethora of location-
based services in which huge amounts of spatial data need to be efficiently processed by …

IoT in smart farming analytics, big data based architecture

EM Ouafiq, A Elrharras, A Mehdary, A Chehri… - … Systems: Proceedings of …, 2021 - Springer
The concern over Smart Farming is growing, where Internet of Things (IoT)
technologies are highlighted in the farm management cycle. Also a large amount of data is …

Dealing with small files problem in Hadoop distributed file system

S Bende, R Shedge - Procedia Computer Science, 2016 - Elsevier
The usage of Hadoop has been increasing greatly in recent years. Hadoop adoption is
widespread. Some notable big users such as Yahoo, Facebook, Netflix, and Amazon use …

A comprehensive survey for Hadoop distributed file system

KJ Merceedi, NA Sabry - Asian Journal of Research in Computer …, 2021 - academia.edu
In the last few days, data and the internet have become increasingly growing, occurring in
big data. For these problems, there are many software frameworks used to increase the …

Improving Hadoop MapReduce performance with data compression: A study using wordcount job

K Rattanaopas, S Kaewkeeree - 2017 14th International …, 2017 - ieeexplore.ieee.org
Hadoop cluster is widely used for executing and analyzing a large data like big data. It has
MapReduce engine for distributing data to each node in cluster. Compression is a benefit …

Small files' problem in Hadoop: A systematic literature review

R Aggarwal, J Verma, M Siwach - … of King Saud University-Computer and …, 2022 - Elsevier
Apache Hadoop is an open-source software library which integrates a wide variety of
software tools and utilities to facilitate the distributed batch processing of big data sets …

Hadoop perfect file: A fast and memory-efficient metadata access archive file to face small files problem in HDFS

Y Zhai, J Tchaye-Kondi, KJ Lin, L Zhu, W Tao… - Journal of Parallel and …, 2021 - Elsevier
HDFS faces several issues when it comes to handling a large number of small files. These
issues are well addressed by archive systems, which combine small files into larger ones …