Optimizing the Training of Co-Located Deep Learning Models Using Cache-Aware Staggering

K Assogba, B Nicolae… - 2023 IEEE 30th …, 2023 - ieeexplore.ieee.org
Despite significant advances, training deep learning models remains a time-consuming and
resource-intensive task. One of the key challenges in this context is the ingestion of the …

TensorSocket: Shared Data Loading for Deep Learning Training

T Robroek, NK Nielsen, P Tözün - arXiv preprint arXiv:2409.18749, 2024 - arxiv.org
Training deep learning models is a repetitive and resource-intensive process. Data
scientists often train several models before landing on a set of parameters (e.g., hyper …

EdgeServe: Efficient deep learning model caching at the edge

T Guo, RJ Walls, SS Ogden - Proceedings of the 4th ACM/IEEE …, 2019 - dl.acm.org
In this work, we look at how to effectively manage and utilize deep learning models at each
edge location, to provide performance guarantees to inference requests. We identify …

Accelerating deep learning inference via learned caches

A Balasubramanian, A Kumar, Y Liu, H Cao… - arXiv preprint arXiv …, 2021 - arxiv.org
Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains
owing to their high accuracy in solving real-world problems. However, this high accuracy …

Saturn: An Optimized Data System for Multi-Large-Model Deep Learning Workloads (Information System Architectures)

K Nagrecha, A Kumar - 2023 - adalabucsd.github.io
Large models such as GPT-3 and ChatGPT have transformed deep learning (DL), powering
applications that have captured the public's imagination. Such models must be trained on …

Hoard: A distributed data caching system to accelerate deep learning training on the cloud

C Pinto, Y Gkoufas, A Reale, S Seelam… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep Learning system architects strive to design a balanced system where the
computational accelerator (FPGA, GPU, etc.) is not starved for data. Feeding training data …

Intermediate data caching optimization for multi-stage and parallel big data frameworks

Z Yang, D Jia, S Ioannidis, N Mi… - 2018 IEEE 11th …, 2018 - ieeexplore.ieee.org
In the era of big data and cloud computing, large amounts of data are generated from user
applications and need to be processed in the datacenter. Data-parallel computing …

Partitioned Neural Network Training via Synthetic Intermediate Labels

CV Karadağ, N Topaloğlu - arXiv preprint arXiv:2403.11204, 2024 - arxiv.org
The proliferation of extensive neural network architectures, particularly deep learning
models, presents a challenge in terms of resource-intensive training. GPU memory …

Enabling efficient large-scale deep learning training with cache coherent disaggregated memory systems

Z Wang, J Sim, E Lim, J Zhao - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
Modern deep learning (DL) training is memory-intensive, constrained by the memory
capacity of each computation component and cross-device communication bandwidth. In …

COS: Cross-Processor Operator Scheduling for Multi-Tenant Deep Learning Inference

C Lin, J Liu - 2024 IEEE/ACM 32nd International Symposium on …, 2024 - ieeexplore.ieee.org
Multi-tenant inference, now a prevalent inference paradigm, requires deploying
multiple deep learning models on the same hardware platform to concurrently process inference …