Towards Lifelong Learning of Large Language Models: A Survey

J Zheng, S Qiu, C Shi, Q Ma - arXiv preprint arXiv:2406.06391, 2024 - arxiv.org
As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental learning, addresses this challenge by enabling LLMs to learn continuously and adaptively over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. This survey delves into the sophisticated landscape of lifelong learning, categorizing strategies into two primary groups: Internal Knowledge and External Knowledge. Internal Knowledge includes continual pretraining and continual finetuning, each enhancing the adaptability of LLMs in various scenarios. External Knowledge encompasses retrieval-based and tool-based lifelong learning, leveraging external data sources and computational tools to extend the model's capabilities without modifying core parameters. The key contributions of our survey are: (1) Introducing a novel taxonomy categorizing the extensive literature of lifelong learning into 12 scenarios; (2) Identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups within each scenario; (3) Highlighting emerging techniques such as model expansion and data selection, which were less explored in the pre-LLM era. Through a detailed examination of these groups and their respective categories, this survey aims to enhance the adaptability, reliability, and overall performance of LLMs in real-world applications.
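The abstract notes that lifelong learning must integrate new knowledge "while retaining previously learned information and preventing catastrophic forgetting." One classic technique in the continual-finetuning family it surveys is rehearsal: keeping a small buffer of past examples and mixing them into each new training batch. Below is a minimal, hypothetical sketch of that idea (the buffer uses reservoir sampling); the class and function names are illustrative and not taken from the paper.

```python
import random


class ReplayBuffer:
    """Fixed-size buffer of past training examples, filled by reservoir sampling
    so every example seen so far has an equal chance of being retained."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Replace a random slot with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))


def mixed_batch(new_examples, buffer, replay_ratio=0.5):
    """Combine new-task examples with replayed old examples, so gradient
    updates on the new task also rehearse earlier data (reducing forgetting)."""
    n_replay = int(len(new_examples) * replay_ratio)
    return list(new_examples) + buffer.sample(n_replay)
```

In practice the buffered items would be tokenized training sequences from earlier pretraining or finetuning stages, and `replay_ratio` trades off plasticity on the new task against stability on old ones.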