The design process for Google's training chips: TPUv2 and TPUv3

J Yao, S Zhang, Y Yao, F Wang, J Ma… - … on Knowledge and …, 2022 - ieeexplore.ieee.org

Influenced by the great success of deep learning via cloud computing and the rapid
development of edge chips, research in artificial intelligence (AI) has shifted to both of the …

Cited by 83 · Related articles · All 5 versions

[PDF] cam.ac.uk

Materials and devices as solutions to computational problems in machine learning

NJ Tye, S Hofmann, P Stanley-Marbell - Nature Electronics, 2023 - nature.com

The growth of machine learning, combined with the approaching limits of conventional
digital computing, are driving a search for alternative and complementary forms of …

Cited by 11 · Related articles · All 2 versions

[PDF] nature.com

Solving olympiad geometry without human demonstrations

TH Trinh, Y Wu, QV Le, H He, T Luong - Nature, 2024 - nature.com

Proving mathematical theorems at the olympiad level represents a notable milestone in
human-level automated reasoning, owing to their reputed difficulty among the world's best …

Cited by 173 · Related articles · All 12 versions

[PDF] acm.org

TPU v4: An optically reconfigurable supercomputer for machine learning with hardware support for embeddings

N Jouppi, G Kurian, S Li, P Ma, R Nagarajan… - Proceedings of the 50th …, 2023 - dl.acm.org

In response to innovations in machine learning (ML) models, production workloads changed
radically and rapidly. TPU v4 is the fifth Google domain specific architecture (DSA) and its …

Cited by 213 · Related articles · All 6 versions

[PDF] cu.edu.eg

Ten lessons from three generations shaped Google's TPUv4i: Industrial product

NP Jouppi, DH Yoon, M Ashcraft… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org

Google deployed several TPU generations since 2015, teaching us lessons that changed
our views: semi-conductor technology advances unequally; compiler compatibility trumps …

Cited by 345 · Related articles · All 6 versions

[PDF] neurips.cc

Revisiting ResNets: Improved training and scaling strategies

I Bello, W Fedus, X Du, ED Cubuk… - Advances in …, 2021 - proceedings.neurips.cc

Novel computer vision architectures monopolize the spotlight, but the impact of the model
architecture is often conflated with simultaneous changes to training methodology and …

Cited by 332 · Related articles · All 7 versions

[PDF] cambridge.org

Mixed precision algorithms in numerical linear algebra

NJ Higham, T Mary - Acta Numerica, 2022 - cambridge.org

Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …

Cited by 92 · Related articles · All 17 versions

[PDF] arxiv.org

Griffin: Mixing gated linear recurrences with local attention for efficient language models

S De, SL Smith, A Fernando, A Botev… - arXiv preprint arXiv …, 2024 - arxiv.org

Recurrent neural networks (RNNs) have fast inference and scale efficiently on long
sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with …

Cited by 40 · Related articles · All 2 versions

[PDF] acm.org

Overlap communication with dependent computation via decomposition in large deep learning models

S Wang, J Wei, A Sabne, A Davis, B Ilbeyi… - Proceedings of the 28th …, 2022 - dl.acm.org

Large deep learning models have shown great potential with state-of-the-art results in many
tasks. However, running these large models is quite challenging on an accelerator (GPU or …

Cited by 42 · Related articles · All 2 versions

[PDF] mlsys.org

Randomness in neural network training: Characterizing the impact of tooling

D Zhuang, X Zhang, S Song… - Proceedings of Machine …, 2022 - proceedings.mlsys.org

The quest for determinism in machine learning has disproportionately focused on
characterizing the impact of noise introduced by algorithmic design choices. In this work, we …

Cited by 75 · Related articles · All 6 versions
