A review of industrial big data for decision making in intelligent manufacturing

C Li, Y Chen, Y Shang - … Science and Technology, an International Journal, 2022 - Elsevier
Under the trend of economic globalization, intelligent manufacturing has attracted a lot of
attention from academia and industry. Related enabling technologies make manufacturing …

Performance evaluation analysis of Spark Streaming backpressure for data-intensive pipelines

KJ Matteussi, JCS Dos Anjos, VRQ Leithardt… - Sensors, 2022 - mdpi.com
A significant rise in the adoption of streaming applications has changed the decision-making
processes in the last decade. This movement has led to the emergence of several Big Data …

Boosting big data streaming applications in clouds with BurstFlow

PRR De Souza, KJ Matteussi, ADS Veith… - IEEE …, 2020 - ieeexplore.ieee.org
The rapid growth of stream applications in financial markets, health care, education, social
media, and sensor networks represents a remarkable milestone for data processing and …

Distributed three-way formal concept analysis for large formal contexts

RK Chunduri, AK Cherukuri - Journal of Parallel and Distributed Computing, 2023 - Elsevier
Three-way concept analysis (3WCA) is a framework based on Formal concept analysis and
three-way decisions that is used in the field of knowledge discovery to solve uncertainties in …

Parallel machine learning algorithm using fine-grained-mode Spark on a Mesos big data cloud computing software framework for mobile robotic intelligent fault …

G Xian - IEEE Access, 2020 - ieeexplore.ieee.org
An accurate and efficient intelligent fault diagnosis of mobile robotic roller bearings can
significantly enhance the reliability and safety of mechanical systems. To improve the …

Cost-aware scheduling and data skew alleviation for big data processing in heterogeneous cloud environment

H Li, L Zhu, S Wang, L Wang - Journal of Grid Computing, 2023 - Springer
For big data applications, it is important to allocate resources reasonably and schedule tasks
effectively. As one of the popular big data processing frameworks, the default scheduling …

Dynamic data replacement and adaptive scheduling policies in Spark

C Li, Q Cai, Y Luo - Cluster Computing, 2022 - Springer
Improper data replacement and inappropriate selection of job scheduling policy are
important reasons for the degradation of Spark system operation speed, which directly …

Energy-efficient scheduling algorithms based on task clustering in heterogeneous Spark clusters

W Shi, H Li, J Guan, H Zeng - Parallel Computing, 2022 - Elsevier
Spark is widely used for its fast in-memory processing. It is important to improve energy
efficiency under deadline constraints. In this paper, a Task Performance Clustering of Best …

Optimization of the join between large tables in the Spark distributed framework

X Wu, Y He - Applied Sciences, 2023 - mdpi.com
The Join task between Spark large tables takes a long time to run and produces a lot of disk
I/O, network I/O and disk occupation in the Shuffle process. This paper proposes a …

Pokémem: Taming wild memory consumers in Apache Spark

M Kweun, G Kim, B Oh, S Jung, T Um… - 2022 IEEE …, 2022 - ieeexplore.ieee.org
Apache Spark is a widely used in-memory processing system due to its high performance.
For fast data processing, Spark manages in-memory data such as cached or shuffling …