Machine learning operations (MLOps): Overview, definition, and architecture

D Kreuzberger, N Kühl, S Hirschl - IEEE Access, 2023 - ieeexplore.ieee.org
The final goal of all industrial machine learning (ML) projects is to develop ML products and
rapidly bring them into production. However, it is highly challenging to automate and …

Aligning artificial intelligence with climate change mitigation

LH Kaack, PL Donti, E Strubell, G Kamiya… - Nature Climate …, 2022 - nature.com
There is great interest in how the growth of artificial intelligence and machine learning may
affect global GHG emissions. However, such emissions impacts remain uncertain, owing in …

Fast inference from transformers via speculative decoding

Y Leviathan, M Kalman… - … Conference on Machine …, 2023 - proceedings.mlr.press
Inference from large autoregressive models like Transformers is slow: decoding K tokens
takes K serial runs of the model. In this work we introduce speculative decoding, an …

2024 IEEE International Conference on Robotics and Automation (ICRA)

Y Peng, C Chen, G Huang - 2024 - par.nsf.gov
As edge devices equipped with cameras and inertial measurement units (IMUs) are
emerging, it holds huge implications to endow these mobile devices with spatial computing …

Quantizable transformers: Removing outliers by helping attention heads do nothing

Y Bondarenko, M Nagel… - Advances in Neural …, 2023 - proceedings.neurips.cc
Transformer models have been widely adopted in various domains over the last years and
especially large language models have advanced the field of AI significantly. Due to their …

Deep physical neural networks trained with backpropagation

LG Wright, T Onodera, MM Stein, T Wang… - Nature, 2022 - nature.com
Deep-learning models have become pervasive tools in science and engineering. However,
their energy requirements now increasingly limit their scalability. Deep-learning …

Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

TinyStories: How small can language models be and still speak coherent English?

R Eldan, Y Li - arXiv preprint arXiv:2305.07759, 2023 - arxiv.org
Language models (LMs) are powerful tools for natural language processing, but they often
struggle to produce coherent and fluent text when they are small. Models with around 125M …

Efficient deep learning: A survey on making deep learning models smaller, faster, and better

G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …

Ensemble distillation for robust model fusion in federated learning

T Lin, L Kong, SU Stich, M Jaggi - Advances in Neural …, 2020 - proceedings.neurips.cc
Federated Learning (FL) is a machine learning setting where many devices collaboratively
train a machine learning model while keeping the training data decentralized. In most of the …