Sustainable AI: Environmental implications, challenges and opportunities

CJ Wu, R Raghavendra, U Gupta… - Proceedings of …, 2022 - proceedings.mlsys.org
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …

Aligning artificial intelligence with climate change mitigation

LH Kaack, PL Donti, E Strubell, G Kamiya… - Nature Climate …, 2022 - nature.com
There is great interest in how the growth of artificial intelligence and machine learning may
affect global GHG emissions. However, such emissions impacts remain uncertain, owing in …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
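The quadratic cost the snippet refers to is visible in a naive attention implementation, which materializes the full n x n score matrix. A minimal NumPy sketch (illustrative only, not the paper's code):

```python
import numpy as np

def naive_attention(Q, K, V):
    """Naive scaled dot-product attention: materializes the full
    n x n score matrix, hence O(n^2) time and memory in sequence length n."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n, n) score matrix -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # (n, d) output

n, d = 512, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
print(out.shape)  # (512, 64); the intermediate score matrix was 512 x 512
```

FlashAttention avoids materializing that n x n matrix by computing the softmax in tiles that fit in on-chip SRAM, keeping the result exact.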

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
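One common baseline among the pruning approaches this survey covers is one-shot magnitude pruning, which zeroes the smallest-magnitude weights. A rough sketch (an illustration of the general idea, not the survey's specific method):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights become zero."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64))
Wp = magnitude_prune(W, 0.9)
print(1 - np.count_nonzero(Wp) / Wp.size)  # ~0.9 fraction of zeros
```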

Efficient large-scale language model training on GPU clusters using Megatron-LM

D Narayanan, M Shoeybi, J Casper… - Proceedings of the …, 2021 - dl.acm.org
Large language models have led to state-of-the-art accuracies across several tasks.
However, training these models efficiently is challenging because: a) GPU memory capacity …

Neural collaborative filtering vs. matrix factorization revisited

S Rendle, W Krichene, L Zhang… - Proceedings of the 14th …, 2020 - dl.acm.org
Embedding based models have been the state of the art in collaborative filtering for over a
decade. Traditionally, the dot product or higher order equivalents have been used to …
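The dot-product scoring the snippet mentions is the core of embedding-based matrix factorization: a predicted rating is the inner product of user and item embeddings, typically plus bias terms. A minimal sketch (names and bias terms are illustrative, not the paper's code):

```python
import numpy as np

def mf_predict(user_emb, item_emb, user_bias, item_bias):
    """Matrix-factorization score: dot product of user and item
    embeddings plus per-user and per-item bias terms."""
    return user_emb @ item_emb + user_bias + item_bias

rng = np.random.default_rng(2)
d = 16
u, v = rng.standard_normal(d), rng.standard_normal(d)
score = mf_predict(u, v, 0.1, -0.05)
print(float(score))
```

The paper's point of comparison is that neural collaborative filtering replaces this fixed dot product with a learned similarity function, such as an MLP over the concatenated embeddings.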

DataPerf: Benchmarks for data-centric AI development

M Mazumder, C Banbury, X Yao… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …

FedScale: Benchmarking model and system performance of federated learning at scale

F Lai, Y Dai, S Singapuram, J Liu… - International …, 2022 - proceedings.mlr.press
We present FedScale, a federated learning (FL) benchmarking suite with realistic datasets
and a scalable runtime to enable reproducible FL research. FedScale datasets encompass …

NVIDIA A100 Tensor Core GPU: Performance and innovation

J Choquette, W Gandhi, O Giroux, N Stam… - IEEE Micro, 2021 - ieeexplore.ieee.org
NVIDIA A100 Tensor Core GPU is NVIDIA's latest flagship GPU. It has been designed with
many new innovative features to provide performance and capabilities for HPC, AI, and data …