A comprehensive survey of continual learning: theory, method and application

L Wang, X Zhang, H Su, J Zhu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
To cope with real-world dynamics, an intelligent system needs to incrementally acquire,
update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as …

Continual learning of large language models: A comprehensive survey

H Shi, Z Xu, H Wang, W Qin, W Wang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent success of large language models (LLMs) trained on static, pre-collected,
general datasets has sparked numerous research directions and applications. One such …

Towards geospatial foundation models via continual pretraining

M Mendieta, B Han, X Shi, Y Zhu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Geospatial technologies are becoming increasingly essential in our world for a wide range
of applications, including agriculture, urban planning, and disaster response. To help …

Knowledge-enhanced event relation extraction via event ontology prompt

L Zhuang, H Fei, P Hu - Information Fusion, 2023 - Elsevier
Identifying temporal and subevent relationships between different events (i.e., event relation
extraction) is an important step towards event-centric natural language processing, which …

Continual pre-training mitigates forgetting in language and vision

A Cossu, A Carta, L Passaro, V Lomonaco… - Neural Networks, 2024 - Elsevier
Pre-trained models are commonly used in Continual Learning to initialize the model before
training on the stream of non-stationary data. However, pre-training is rarely applied during …

Test of time: Instilling video-language models with a sense of time

P Bagad, M Tapaswi… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Modelling and understanding time remains a challenge in contemporary video
understanding models. With language emerging as a key driver towards powerful …

Continual Pre-Training of Large Language Models: How to (re)warm your model?

K Gupta, B Thérien, A Ibrahim, ML Richter… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are routinely pre-trained on billions of tokens, only to restart
the process over again once new data becomes available. A much cheaper and more …

Syntax-based dynamic latent graph for event relation extraction

L Zhuang, H Fei, P Hu - Information Processing & Management, 2023 - Elsevier
This paper focuses on extracting temporal and parent–child relationships between news
events in social news. Previous methods have proved that syntactic features are valid …

Selecting optimal context sentences for event-event relation extraction

H Man, NT Ngo, LN Van, TH Nguyen - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Understanding events entails recognizing the structural and temporal orders between event
mentions to build event structures/graphs for input documents. To achieve this goal, our …

Timers: document-level temporal relation extraction

P Mathur, R Jain, F Dernoncourt… - Proceedings of the …, 2021 - aclanthology.org
We present TIMERS, a TIME, Rhetorical and Syntactic-aware model for document-level
temporal relation classification in the English language. Our proposed method leverages …