Efficient deep learning: A survey on making deep learning models smaller, faster, and better

G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …

Deep learning in electron microscopy

JM Ede - Machine Learning: Science and Technology, 2021 - iopscience.iop.org
Deep learning is transforming most areas of science and technology, including electron
microscopy. This review paper offers a practical perspective aimed at developers with …

Switch Transformers: Scaling to trillion parameter models with simple and efficient sparsity

W Fedus, B Zoph, N Shazeer - Journal of Machine Learning Research, 2022 - jmlr.org
In deep learning, models typically reuse the same parameters for all inputs. Mixture of
Experts (MoE) models defy this and instead select different parameters for each incoming …

Hardware architecture and software stack for PIM based on commercial DRAM technology: Industrial product

S Lee, S Kang, J Lee, H Kim, E Lee… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Emerging applications such as deep neural network demand high off-chip memory
bandwidth. However, under stringent physical constraints of chip packages and system …

Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

MJ Rasch, C Mackin, M Le Gallo, A Chen… - Nature …, 2023 - nature.com
Analog in-memory computing—a promising approach for energy-efficient acceleration of
deep learning workloads—computes matrix-vector multiplications but only approximately …

Mixed precision algorithms in numerical linear algebra

NJ Higham, T Mary - Acta Numerica, 2022 - cambridge.org
Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …

Pushing the limits of narrow precision inferencing at cloud scale with Microsoft Floating Point

B Darvish Rouhani, D Lo, R Zhao… - Advances in neural …, 2020 - proceedings.neurips.cc
In this paper, we explore the limits of Microsoft Floating Point (MSFP), a new class of
datatypes developed for production cloud-scale inferencing on custom hardware. Through …

1.1 The deep learning revolution and its implications for computer architecture and chip design

J Dean - 2020 IEEE International Solid-State Circuits …, 2020 - ieeexplore.ieee.org
The past decade has seen a remarkable series of advances in machine learning, and in
particular deep learning approaches based on artificial neural networks, to improve our …

Applications of deep learning in fish habitat monitoring: A tutorial and survey

A Saleh, M Sheaves, D Jerry, MR Azghadi - Expert Systems with …, 2024 - Elsevier
Marine ecosystems and their fish habitats are becoming increasingly important due to their
integral role in providing a valuable food source and conservation outcomes. Due to their …

TensorDash: Exploiting sparsity to accelerate deep neural network training

M Mahmoud, I Edo, AH Zadeh… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
TensorDash is a hardware-based technique that enables data-parallel MAC units to take
advantage of sparsity in their input operand streams. When used to compose a hardware …