Anatomy of machine learning algorithm implementations in MPI, Spark, and Flink

S Kamburugamuve… - … Journal of High …, 2018 - journals.sagepub.com
With the ever-increasing need to analyze large amounts of data to get useful insights, it is
essential to develop complex parallel machine learning algorithms that can scale with data …

SmartBench: A benchmark for data management in smart spaces

P Gupta, MJ Carey, S Mehrotra… - Proceedings of the VLDB …, 2020 - dl.acm.org
This paper proposes SmartBench, a benchmark focusing on queries resulting from (near)
real-time applications and longer-term analysis of IoT data. SmartBench, derived from a …

ROSE: Cluster resource scheduling via speculative over-subscription

X Sun, C Hu, R Yang, P Garraghan… - 2018 IEEE 38th …, 2018 - ieeexplore.ieee.org
A long-standing challenge in cluster scheduling is to achieve a high degree of utilization of
heterogeneous resources in a cluster. In practice there exists a substantial disparity between …

Speeding up SpMV for power-law graph analytics by enhancing locality & vectorization

S Yesil, A Heidarshenas, A Morrison… - … Conference for High …, 2020 - ieeexplore.ieee.org
Graph analytics applications often target large-scale web and social networks, which are
typically power-law graphs. Graph algorithms can often be recast as generalized Sparse …

Mammoths Are Slow: The Overlooked Transactions of Graph Data

A Cheng, J Waudby, H Firth, N Crooks… - Proceedings of the VLDB …, 2023 - dl.acm.org
This paper argues for better concurrency control to support mammoth transactions, which
read and write to many items. While these requests are prevalent on graph data, few …

Towards low-latency I/O services for mixed workloads using ultra-low latency SSDs

M Liu, H Liu, C Ye, X Liao, H Jin, Y Zhang… - Proceedings of the 36th …, 2022 - dl.acm.org
Low-latency I/O services are essential for latency-sensitive workloads when they co-run with
throughput-oriented workloads in cloud data centers. Although advanced SSDs such as Intel …

Evaluating interactive data systems: Survey and case studies

P Rahman, L Jiang, A Nandi - The VLDB Journal, 2020 - Springer
Interactive query interfaces have become a popular tool for ad hoc data analysis and
exploration. Compared with traditional systems that are optimized for throughput or batched …

MAS-Cloud+: A novel multi-agent architecture with reasoning models for resource management in multiple providers

AHD Mendes, MJF Rosa, MA Marotta, A Araujo… - Future Generation …, 2024 - Elsevier
Nowadays, scientific and commercial applications are often deployed to cloud environments
requiring multiple resource types. This scenario increases the necessity for efficient resource …

DV-DVFS: merging data variety and DVFS technique to manage the energy consumption of big data processing

H Ahmadvand, F Foroutan, M Fathy - Journal of Big Data, 2021 - Springer
Data variety is one of the most important features of Big Data. Data variety is the result of
aggregating data from multiple sources and uneven distribution of data. This feature of Big …

Understanding the behavior of in-memory computing workloads

T Jiang, Q Zhang, R Hou, L Chai… - 2014 IEEE …, 2014 - ieeexplore.ieee.org
The increasing demands of big data applications have led researchers and practitioners to
turn to in-memory computing to speed processing. For instance, the Apache Spark …