Edge-cloud polarization and collaboration: A comprehensive survey for AI

J Yao, S Zhang, Y Yao, F Wang, J Ma… - … on Knowledge and …, 2022 - ieeexplore.ieee.org
Influenced by the great success of deep learning via cloud computing and the rapid
development of edge chips, research in artificial intelligence (AI) has shifted to both of the …

Materials and devices as solutions to computational problems in machine learning

NJ Tye, S Hofmann, P Stanley-Marbell - Nature Electronics, 2023 - nature.com
The growth of machine learning, combined with the approaching limits of conventional
digital computing, is driving a search for alternative and complementary forms of …

Solving olympiad geometry without human demonstrations

TH Trinh, Y Wu, QV Le, H He, T Luong - Nature, 2024 - nature.com
Proving mathematical theorems at the olympiad level represents a notable milestone in
human-level automated reasoning, owing to their reputed difficulty among the world's best …

TPU v4: An optically reconfigurable supercomputer for machine learning with hardware support for embeddings

N Jouppi, G Kurian, S Li, P Ma, R Nagarajan… - Proceedings of the 50th …, 2023 - dl.acm.org
In response to innovations in machine learning (ML) models, production workloads changed
radically and rapidly. TPU v4 is the fifth Google domain specific architecture (DSA) and its …

Ten lessons from three generations shaped Google's TPUv4i: Industrial product

NP Jouppi, DH Yoon, M Ashcraft… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Google deployed several TPU generations since 2015, teaching us lessons that changed
our views: semi-conductor technology advances unequally; compiler compatibility trumps …

Revisiting ResNets: Improved training and scaling strategies

I Bello, W Fedus, X Du, ED Cubuk… - Advances in …, 2021 - proceedings.neurips.cc
Novel computer vision architectures monopolize the spotlight, but the impact of the model
architecture is often conflated with simultaneous changes to training methodology and …

Mixed precision algorithms in numerical linear algebra

NJ Higham, T Mary - Acta Numerica, 2022 - cambridge.org
Today's floating-point arithmetic landscape is broader than ever. While scientific computing
has traditionally used single precision and double precision floating-point arithmetics, half …

Griffin: Mixing gated linear recurrences with local attention for efficient language models

S De, SL Smith, A Fernando, A Botev… - arXiv preprint arXiv …, 2024 - arxiv.org
Recurrent neural networks (RNNs) have fast inference and scale efficiently on long
sequences, but they are difficult to train and hard to scale. We propose Hawk, an RNN with …

Overlap communication with dependent computation via decomposition in large deep learning models

S Wang, J Wei, A Sabne, A Davis, B Ilbeyi… - Proceedings of the 28th …, 2022 - dl.acm.org
Large deep learning models have shown great potential with state-of-the-art results in many
tasks. However, running these large models is quite challenging on an accelerator (GPU or …

Randomness in neural network training: Characterizing the impact of tooling

D Zhuang, X Zhang, S Song… - Proceedings of Machine …, 2022 - proceedings.mlsys.org
The quest for determinism in machine learning has disproportionately focused on
characterizing the impact of noise introduced by algorithmic design choices. In this work, we …