Continual learning of large language models: A comprehensive survey

H Shi, Z Xu, H Wang, W Qin, W Wang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent success of large language models (LLMs) trained on static, pre-collected,
general datasets has sparked numerous research directions and applications. One such …

MobileCLIP: Fast image-text models through multi-modal reinforced training

PKA Vasu, H Pouransari, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
Contrastive pre-training of image-text foundation models, such as CLIP, has demonstrated
excellent zero-shot performance and improved robustness on a wide range of downstream …
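
For context on the technique named here: CLIP-style contrastive pre-training trains an image encoder and a text encoder so that matched image-text pairs score higher than all mismatched pairs in a batch, via a symmetric InfoNCE loss. The sketch below is a minimal PyTorch version of that generic objective, assuming precomputed embeddings; it is not MobileCLIP's multi-modal reinforced training recipe, and the function name and temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    image_emb, text_emb: (batch, dim) tensors; row i of each is a matched pair.
    """
    # Normalize so dot products are cosine similarities.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix; the diagonal holds the positive pairs.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: image-to-text and text-to-image.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```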

BrainWash: A Poisoning Attack to Forget in Continual Learning

A Abbasi, P Nooralinejad… - Proceedings of the …, 2024 - openaccess.thecvf.com
Continual learning has gained substantial attention within the deep learning community,
offering promising solutions to the challenging problem of sequential learning. Yet a largely …

Online adaptation of language models with a memory of amortized contexts

J Tack, J Kim, E Mitchell, J Shin, YW Teh… - arXiv preprint arXiv …, 2024 - arxiv.org
Due to the rapid generation and dissemination of information, large language models
(LLMs) quickly become outdated despite enormous development costs. Given this crucial need …

Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection

G Bhatt, J Ross, L Sigal - arXiv preprint arXiv:2403.14797, 2024 - arxiv.org
Modern pre-trained architectures struggle to retain previous information while undergoing
continuous fine-tuning on new tasks. Despite notable progress in continual classification …

Local vs Global continual learning

G Lanzillotta, SP Singh, BF Grewe… - arXiv preprint arXiv …, 2024 - arxiv.org
Continual learning is the problem of integrating new information into a model while retaining
the knowledge acquired in the past. Despite the tangible improvements achieved in recent …
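
As background on the problem statement, one standard baseline for "retaining the knowledge acquired in the past" is a quadratic penalty, as in elastic weight consolidation, that anchors parameters to values learned on earlier tasks, weighted by an importance estimate. The sketch below illustrates that generic idea only, assuming `model` is a `torch.nn.Module`; it is not this paper's local/global analysis, and `ewc_penalty`, `old_params`, and `fisher_diag` are hypothetical names.

```python
def ewc_penalty(model, old_params, fisher_diag, lam=1.0):
    """EWC-style regularizer added to the new task's loss.

    old_params / fisher_diag: dicts mapping parameter name to tensors saved
    after the previous task (fisher_diag estimates each coordinate's
    importance, e.g. from squared gradients on the old task's data).
    """
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + (fisher_diag[name] * (p - old_params[name]) ** 2).sum()
    return lam / 2.0 * penalty
```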

RanDumb: A Simple Approach that Questions the Efficacy of Continual Representation Learning

A Prabhu, S Sinha, P Kumaraguru, PHS Torr… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose RanDumb to examine the efficacy of continual representation learning.
RanDumb embeds raw pixels using a fixed random transform which approximates an RBF …
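
The "fixed random transform which approximates an RBF" reads like the standard random Fourier features construction of Rahimi and Recht, so a minimal NumPy sketch of that construction is given below. The exact transform and downstream classifier RanDumb uses may differ, and `dim_out`, `gamma`, and the function name are illustrative.

```python
import numpy as np

def random_fourier_features(X, dim_out=2000, gamma=1.0, seed=0):
    """Fixed random feature map approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2) via random Fourier features."""
    rng = np.random.default_rng(seed)
    # Frequencies drawn from the kernel's spectral density, phases uniform.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], dim_out))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim_out)
    return np.sqrt(2.0 / dim_out) * np.cos(X @ W + b)
```

Because the transform never changes, a continual pipeline on top of it only needs to update, say, per-class feature means or a linear classifier as new data arrives, which is what makes it a pointed baseline against learned representations.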

Demystifying Forgetting in Language Model Fine-Tuning with Statistical Analysis of Example Associations

X Jin, X Ren - arXiv preprint arXiv:2406.14026, 2024 - arxiv.org
Language models (LMs) are known to forget previously learned examples when fine-tuned,
breaking the stability of deployed LM systems. Despite efforts to mitigate …
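
To make example-level forgetting concrete: one simple statistic is the fraction of examples a model answered correctly before fine-tuning but incorrectly after. The helper below is a minimal illustration of that bookkeeping, not the paper's statistical analysis of example associations; all names are hypothetical.

```python
def forgetting_stats(correct_before, correct_after):
    """Return ids of forgotten examples and the forgetting rate.

    correct_before / correct_after: dicts mapping example id -> bool,
    recording whether the model answered each example correctly at the
    two checkpoints (before and after fine-tuning).
    """
    previously_correct = {i for i, ok in correct_before.items() if ok}
    forgotten = sorted(i for i in previously_correct
                       if not correct_after.get(i, False))
    rate = len(forgotten) / max(len(previously_correct), 1)
    return forgotten, rate
```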

Robust Machine Learning: Detection, Evaluation and Adaptation Under Distribution Shift

S Garg - 2024 - kilthub.cmu.edu
Deep learning, despite its broad applicability, grapples with robustness challenges in
real-world applications, especially when training and test distributions differ. Reasons for the …