FlashNeuron: SSD-Enabled Large-Batch Training of Very Deep Neural Networks

J Bae, J Lee, Y Jin, S Son, S Kim, H Jang… - … USENIX Conference on …, 2021 - usenix.org
Deep neural networks (DNNs) are widely used in various AI application domains such as
computer vision, natural language processing, autonomous driving, and bioinformatics. As …

Behemoth: a flash-centric training accelerator for extreme-scale DNNs

S Kim, Y Jin, G Sohn, J Bae, TJ Ham… - 19th USENIX Conference …, 2021 - usenix.org
The explosive growth of deep neural network (DNN) model sizes drives the need for
larger memory capacity. This trend is particularly true for models in natural language …

Stannis: low-power acceleration of DNN training using computational storage devices

A HeydariGorji, M Torabzadehkashi… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
Computational storage devices enable in-place processing of data inside the storage device. These
devices contain 64-bit application processors and hardware accelerators that can help …

Stronghold: fast and affordable billion-scale deep learning model training

X Sun, W Wang, S Qiu, R Yang… - … Conference for High …, 2022 - ieeexplore.ieee.org
Deep neural networks (DNNs) with billion-scale parameters have demonstrated impressive
performance in solving many tasks. Unfortunately, training a billion-scale DNN is out of the …

Mini-batch serialization: CNN training with inter-layer data reuse

S Lym, A Behroozi, W Wen, G Li… - … of Machine Learning …, 2019 - proceedings.mlsys.org
Training convolutional neural networks (CNNs) requires intense computations and high
memory bandwidth. We find that bandwidth today is over-provisioned because most memory …

Harmony: Overcoming the hurdles of GPU memory capacity to train massive DNN models on commodity servers

Y Li, A Phanishayee, D Murray, J Tarnawski… - arXiv preprint arXiv …, 2022 - arxiv.org
Deep neural networks (DNNs) have grown exponentially in size over the past decade,
leaving only those who have massive datacenter-based resources with the ability to develop …

GradPIM: A practical processing-in-DRAM architecture for gradient descent

H Kim, H Park, T Kim, K Cho, E Lee… - … Symposium on High …, 2021 - ieeexplore.ieee.org
In this paper, we present GradPIM, a processing-in-memory architecture that accelerates
the parameter updates of deep neural network training. As one of processing-in-memory …

Dynamic memory management for GPU-based training of deep neural networks

SB Shriram, A Garg, P Kulkarni - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Deep learning has been widely adopted for different applications of artificial intelligence:
speech recognition, natural language processing, computer vision, etc. The growing size of …

Fusing in-storage and near-storage acceleration of convolutional neural networks

I Okafor, AK Ramanathan, NR Challapalle, Z Li… - ACM Journal on …, 2023 - dl.acm.org
Video analytics has a wide range of applications and has attracted much interest over the
years. While it can be both computationally and energy-intensive, video analytics can greatly …

Enabling Large Dynamic Neural Network Training with Learning-based Memory Management

J Ren, D Xu, S Yang, J Zhao, Z Li… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Dynamic neural networks (DyNNs) enable high computational efficiency and strong
representation capability. However, training a DyNN can face a memory capacity problem …