Mitigating catastrophic forgetting in large language models with self-synthesized rehearsal

M Jovanovic, P Voss - arXiv preprint arXiv:2404.18311, 2024 - arxiv.org

Real-time learning concerns the ability of learning systems to acquire knowledge over time,
enabling their adaptation and generalization to novel tasks. It is a critical ability for …

Cited by 9 Related articles All 2 versions

[PDF] aclanthology.org

Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup

X Cheng, Z Yao, Y Xin, H An, H Li, Y Li… - Proceedings of the 62nd …, 2024 - aclanthology.org

Multimodal machine translation (MMT) aims to improve the performance of machine
translation with the help of visual information, which has received widespread attention …

[PDF] arxiv.org

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

J Kang, L Karlinsky, H Luo, Z Wang, J Hansen… - arXiv preprint arXiv …, 2024 - arxiv.org

We present Self-MoE, an approach that transforms a monolithic LLM into a compositional,
modular system of self-specialized experts, named MiXSE (MiXture of Self-specialized …

Cited by 2 Related articles

[PDF] arxiv.org

Mitigating Catastrophic Forgetting in Language Transfer via Model Merging

A Alexandrov, V Raychev, MN Müller, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

As open-weight large language models (LLMs) achieve ever more impressive performances
across a wide range of tasks in English, practitioners aim to adapt these models to different …

Towards Lifelong Learning of Large Language Models: A Survey

J Zheng, S Qiu, C Shi, Q Ma - arXiv preprint arXiv:2406.06391, 2024 - arxiv.org

As the applications of large language models (LLMs) expand across diverse fields, the
ability of these models to adapt to ongoing changes in data, tasks, and user preferences …

Cited by 2 Related articles All 2 versions

[PDF] arxiv.org

Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector

G Jiang, Z Li, C Jiang, S Xue, J Zhou, L Song… - arXiv preprint arXiv …, 2024 - arxiv.org

Fine-tuning large language models (LLMs) can cause them to lose their general capabilities.
However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper …

[PDF] arxiv.org

Cited by 1 Related articles
