Data management in machine learning: Challenges, techniques, and systems

A Kumar, M Boehm, J Yang - Proceedings of the 2017 ACM International …, 2017 - dl.acm.org
Large-scale data analytics using statistical machine learning (ML), popularly called
advanced analytics, underpins many modern data-driven applications. The data …

MNN: A universal and efficient inference engine

X Jiang, H Wang, Y Chen, Z Wu… - Proceedings of …, 2020 - proceedings.mlsys.org
Deploying deep learning (DL) models on mobile devices draws more and more attention
recently. However, designing an efficient inference engine on devices is under the great …

SystemML: Declarative machine learning on Spark

M Boehm, MW Dusenberry, D Eriksson… - Proceedings of the …, 2016 - dl.acm.org
The rising need for custom machine learning (ML) algorithms and the growing data sizes
that require the exploitation of distributed, data-parallel frameworks such as MapReduce or …

Apollo: Automatic partition-based operator fusion through layer by layer optimization

J Zhao, X Gao, R Xia, Z Zhang… - Proceedings of …, 2022 - proceedings.mlsys.org
We study fusion for deep neural networks (DNNs) in a just-in-time (JIT) compilation
framework Apollo. It considers both memory- and compute-bound tensor operators for fusion …

AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures

Z Zheng, X Yang, P Zhao, G Long, K Zhu… - Proceedings of the 27th …, 2022 - dl.acm.org
This work reveals that memory-intensive computation is a rising performance-critical factor in
recent machine learning models. Due to a unique set of new challenges, existing ML …

Pump up the volume: Processing large data on GPUs with fast interconnects

C Lutz, S Breß, S Zeuch, T Rabl, V Markl - Proceedings of the 2020 ACM …, 2020 - dl.acm.org
GPUs have long been discussed as accelerators for database query processing because of
their high processing power and memory bandwidth. However, two main challenges limit the …

Achieving on-mobile real-time super-resolution with neural architecture and pruning search

Z Zhan, Y Gong, P Zhao, G Yuan… - Proceedings of the …, 2021 - openaccess.thecvf.com
Though recent years have witnessed remarkable progress in single image super-resolution
(SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep …

Chimera: An analytical optimizing framework for effective compute-intensive operators fusion

S Zheng, S Chen, P Song, R Chen, X Li… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Machine learning models with various tensor operators are becoming ubiquitous in recent
years. There are two types of operators in machine learning: compute-intensive operators …

Gradient compression supercharged high-performance data parallel DNN training

Y Bai, C Li, Q Zhou, J Yi, P Gong, F Yan… - Proceedings of the …, 2021 - dl.acm.org
Gradient compression is a promising approach to alleviating the communication bottleneck
in data parallel deep neural network (DNN) training by significantly reducing the data …

Triton join: Efficiently scaling to a large join state on GPUs with fast interconnects

C Lutz, S Breß, S Zeuch, T Rabl, V Markl - Proceedings of the 2022 …, 2022 - dl.acm.org
Database management systems are facing growing data volumes. Previous research
suggests that GPUs are well-equipped to quickly process joins and similar stateful …