Sustainable AI: Environmental implications, challenges and opportunities

CJ Wu, R Raghavendra, U Gupta… - Proceedings of …, 2022 - proceedings.mlsys.org
This paper explores the environmental impact of the super-linear growth trends for AI from a
holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the …

Aligning artificial intelligence with climate change mitigation

LH Kaack, PL Donti, E Strubell, G Kamiya… - Nature Climate …, 2022 - nature.com
There is great interest in how the growth of artificial intelligence and machine learning may
affect global GHG emissions. However, such emissions impacts remain uncertain, owing in …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
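The quadratic cost the snippet refers to is visible in a naive attention implementation, which materializes the full n x n score matrix. A minimal NumPy sketch (illustrative only, not the paper's code):

```python
import numpy as np

def naive_attention(Q, K, V):
    """Naive scaled dot-product attention: materializes the full
    n x n score matrix, hence O(n^2) time and memory in sequence length n."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (n, n) score matrix -- the quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V  # (n, d) output

n, d = 512, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(Q, K, V)
print(out.shape)  # (512, 64); the intermediate score matrix was 512 x 512
```

FlashAttention avoids materializing that n x n matrix by computing the softmax in tiles that fit in on-chip SRAM, keeping the result exact.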

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks

T Hoefler, D Alistarh, T Ben-Nun, N Dryden… - Journal of Machine …, 2021 - jmlr.org
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
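One common baseline among the pruning approaches this survey covers is one-shot magnitude pruning, which zeroes the smallest-magnitude weights. A rough sketch (an illustration of the general idea, not the survey's specific method):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly a
    `sparsity` fraction of the weights become zero."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64))
Wp = magnitude_prune(W, 0.9)
print(1 - np.count_nonzero(Wp) / Wp.size)  # ~0.9 fraction of zeros
```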

Efficient large-scale language model training on GPU clusters using Megatron-LM

D Narayanan, M Shoeybi, J Casper… - Proceedings of the …, 2021 - dl.acm.org
Large language models have led to state-of-the-art accuracies across several tasks.
However, training these models efficiently is challenging because: a) GPU memory capacity …

Neural collaborative filtering vs. matrix factorization revisited

S Rendle, W Krichene, L Zhang… - Proceedings of the 14th …, 2020 - dl.acm.org
Embedding based models have been the state of the art in collaborative filtering for over a
decade. Traditionally, the dot product or higher order equivalents have been used to …
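The dot-product scoring the snippet mentions is the core of embedding-based matrix factorization: a predicted rating is the inner product of user and item embeddings, typically plus bias terms. A minimal sketch (names and bias terms are illustrative, not the paper's code):

```python
import numpy as np

def mf_predict(user_emb, item_emb, user_bias, item_bias):
    """Matrix-factorization score: dot product of user and item
    embeddings plus per-user and per-item bias terms."""
    return user_emb @ item_emb + user_bias + item_bias

rng = np.random.default_rng(2)
d = 16
u, v = rng.standard_normal(d), rng.standard_normal(d)
score = mf_predict(u, v, 0.1, -0.05)
print(float(score))
```

The paper's point of comparison is that neural collaborative filtering replaces this fixed dot product with a learned similarity function, such as an MLP over the concatenated embeddings.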

DataPerf: Benchmarks for data-centric AI development

M Mazumder, C Banbury, X Yao… - Advances in …, 2024 - proceedings.neurips.cc
Machine learning research has long focused on models rather than datasets, and
prominent datasets are used for common ML tasks without regard to the breadth, difficulty …

FedScale: Benchmarking model and system performance of federated learning at scale

F Lai, Y Dai, S Singapuram, J Liu… - International …, 2022 - proceedings.mlr.press
We present FedScale, a federated learning (FL) benchmarking suite with realistic datasets
and a scalable runtime to enable reproducible FL research. FedScale datasets encompass …

NVIDIA A100 Tensor Core GPU: Performance and innovation

J Choquette, W Gandhi, O Giroux, N Stam… - IEEE Micro, 2021 - ieeexplore.ieee.org
NVIDIA A100 Tensor Core GPU is NVIDIA's latest flagship GPU. It has been designed with
many new innovative features to provide performance and capabilities for HPC, AI, and data …