Advancing transformer architecture in long-context large language models: A comprehensive survey

Y Huang, J Xu, J Lai, Z Jiang, T Chen, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org
With the breakthrough ignited by ChatGPT, Transformer-based Large Language Models (LLMs)
have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been …

Retrieval-augmented generation for natural language processing: A survey

S Wu, Y Xiong, Y Cui, H Wu, C Chen, Y Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have demonstrated great success in various fields,
benefiting from their vast number of parameters that store knowledge. However, LLMs still …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

G Team, P Georgiev, VI Lei, R Burnell, L Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce the Gemini 1.5 family of models, representing the next generation
of highly compute-efficient multimodal models capable of recalling and reasoning over fine …

Generating images with multimodal language models

JY Koh, D Fried… - Advances in Neural …, 2024 - proceedings.neurips.cc
We propose a method to fuse frozen text-only large language models (LLMs) with pre-
trained image encoder and decoder models, by mapping between their embedding spaces …

LeanDojo: Theorem proving with retrieval-augmented language models

K Yang, A Swope, A Gu, R Chalamala… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have shown promise in proving formal theorems using proof
assistants such as Lean. However, existing methods are difficult to reproduce or build on …

A survey on RAG meeting LLMs: Towards retrieval-augmented large language models

W Fan, Y Ding, L Ning, S Wang, H Li, D Yin… - Proceedings of the 30th …, 2024 - dl.acm.org
As one of the most advanced techniques in AI, Retrieval-Augmented Generation (RAG) can
offer reliable and up-to-date external knowledge, providing considerable convenience for numerous …

Augmenting language models with long-term memory

W Wang, L Dong, H Cheng, X Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Existing large language models (LLMs) can only afford fixed-size inputs due to the input
length limit, preventing them from utilizing rich long-context information from past inputs. To …

Focused transformer: Contrastive training for context scaling

S Tworkowski, K Staniszewski… - Advances in …, 2024 - proceedings.neurips.cc
Large language models have an exceptional capability to incorporate new information in a
contextual manner. However, the full potential of such an approach is often restrained due to …

Emergent abilities of large language models

J Wei, Y Tay, R Bommasani, C Raffel, B Zoph… - arXiv preprint arXiv …, 2022 - arxiv.org
Scaling up language models has been shown to predictably improve performance and
sample efficiency on a wide range of downstream tasks. This paper instead discusses an …