Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, J Lai, Z Jiang, T Chen, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
With the breakthrough ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

Retrieval-augmented generation for natural language processing: A survey

S Wu, Y Xiong, Y Cui, H Wu, C Chen, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their vast number of parameters that store knowledge. However, LLMs still …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

G Team, P Georgiev, VI Lei, R Burnell, L Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce the Gemini 1.5 family of models, representing the next generation
of highly compute-efficient multimodal models capable of recalling and reasoning over fine …

Generating images with multimodal language models

JY Koh, D Fried… - Advances in Neural …, 2024 - proceedings.neurips.cc
We propose a method to fuse frozen text-only large language models (LLMs) with pre-
trained image encoder and decoder models, by mapping between their embedding spaces …

LeanDojo: Theorem proving with retrieval-augmented language models

K Yang, A Swope, A Gu, R Chalamala… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have shown promise in proving formal theorems using proof
assistants such as Lean. However, existing methods are difficult to reproduce or build on …

A survey on RAG meeting LLMs: Towards retrieval-augmented large language models

W Fan, Y Ding, L Ning, S Wang, H Li, D Yin… - Proceedings of the 30th …, 2024 - dl.acm.org
As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can
offer reliable and up-to-date external knowledge, providing considerable convenience for numerous …

Augmenting language models with long-term memory

W Wang, L Dong, H Cheng, X Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Existing large language models (LLMs) can only afford fixed-size inputs due to the input
length limit, preventing them from utilizing rich long-context information from past inputs. To …

Focused transformer: Contrastive training for context scaling

S Tworkowski, K Staniszewski… - Advances in …, 2024 - proceedings.neurips.cc
Large language models have an exceptional capability to incorporate new information in a
contextual manner. However, the full potential of such an approach is often restrained due to …

Emergent abilities of large language models

J Wei, Y Tay, R Bommasani, C Raffel, B Zoph… - arXiv preprint arXiv …, 2022 - arxiv.org
Scaling up language models has been shown to predictably improve performance and
sample efficiency on a wide range of downstream tasks. This paper instead discusses an …