Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Pre-trained language models for text generation: A survey

J Li, T Tang, WX Zhao, JY Nie, JR Wen - ACM Computing Surveys, 2024 - dl.acm.org
Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration

J Lin, J Tang, H Tang, S Yang… - Proceedings of …, 2024 - proceedings.mlsys.org
Large language models (LLMs) have shown excellent performance on various tasks, but the
astronomical model size raises the hardware barrier for serving (memory size) and slows …
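The title names activation-aware weight quantization. As a rough illustration of that general idea (not the published AWQ algorithm), the sketch below rescales weight input channels by a hypothetical per-channel activation statistic gathered on calibration data before round-to-nearest quantization, so channels that see large activations lose less precision; all names and the scaling exponent are illustrative assumptions.

```python
import numpy as np

def activation_aware_quantize(W, act_scale, n_bits=4, alpha=0.5):
    """Illustrative sketch of activation-guided weight-only quantization.

    W         : (out_features, in_features) float weight matrix
    act_scale : (in_features,) mean |activation| per input channel
                (hypothetical calibration statistic)
    """
    # Scale up input channels that see large activations so their weights
    # lose less precision when rounded; the inverse scale could be folded
    # into the preceding layer's output at inference time.
    s = np.clip(act_scale, 1e-5, None) ** alpha        # (in_features,)
    W_scaled = W * s                                    # broadcast over channels

    # Symmetric per-output-row round-to-nearest quantization.
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(W_scaled).max(axis=1, keepdims=True) / qmax
    step = np.where(step == 0.0, 1.0, step)
    W_q = np.clip(np.round(W_scaled / step), -qmax - 1, qmax).astype(np.int8)

    # Dequantized weights; dividing by s undoes the channel scaling.
    W_deq = W_q * step / s
    return W_q, step, s, W_deq
```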

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - Transactions of the Association for …, 2024 - direct.mit.edu
Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …

GPT3.int8(): 8-bit matrix multiplication for transformers at scale

T Dettmers, M Lewis, Y Belkada… - Advances in Neural …, 2022 - proceedings.neurips.cc
Large language models have been widely adopted but require significant GPU memory for
inference. We develop a procedure for Int8 matrix multiplication for feed-forward and …
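The snippet mentions an Int8 matrix-multiplication procedure. Below is a minimal NumPy sketch of plain vector-wise absmax int8 matmul, assuming per-row activation scales and per-column weight scales with int32 accumulation; it is only the basic technique, not the paper's full procedure, and the function names are illustrative.

```python
import numpy as np

def absmax_int8(x, axis):
    # Symmetric absmax quantization to int8 along one axis.
    scale = np.abs(x).max(axis=axis, keepdims=True) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(X, W):
    """Approximate X @ W with int8 operands and a float rescale.

    X : (batch, in_features) activations, quantized per row
    W : (in_features, out_features) weights, quantized per column
    """
    Xq, sx = absmax_int8(X, axis=1)          # sx: (batch, 1)
    Wq, sw = absmax_int8(W, axis=0)          # sw: (1, out_features)
    # Integer multiply with int32 accumulation, then dequantize.
    acc = Xq.astype(np.int32) @ Wq.astype(np.int32)
    return acc * sx * sw
```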

GPTQ: Accurate post-training quantization for generative pre-trained transformers

E Frantar, S Ashkboos, T Hoefler, D Alistarh - arXiv preprint arXiv …, 2022 - arxiv.org
Generative Pre-trained Transformer models, known as GPT or OPT, set themselves apart
through breakthrough performance across complex language modelling tasks, but also by …

ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers

Z Yao, R Yazdani Aminabadi… - Advances in …, 2022 - proceedings.neurips.cc
How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Quantizable transformers: Removing outliers by helping attention heads do nothing

Y Bondarenko, M Nagel… - Advances in Neural …, 2023 - proceedings.neurips.cc
Transformer models have been widely adopted in various domains over the last years and
especially large language models have advanced the field of AI significantly. Due to their …