MM-LLMs: Recent Advances in MultiModal Large Language Models

D Zhang, Y Yu, J Dong, C Li, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

A comprehensive survey of large language models and multimodal large language models in medicine

H Xiao, F Zhou, X Liu, T Liu, Z Li, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal
large language models (MLLMs) have garnered significant attention due to their powerful …

Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models

M Cao, Y Liu, Y Liu, T Wang, J Dong, H Ding… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning constitutes a prevalent technique for tailoring Large Vision Language
Models (LVLMs) to meet individual task requirements. To date, most of the existing …

LLMs Can Evolve Continually on Modality for X-Modal Reasoning

J Yu, H Xiong, L Zhang, H Diao, Y Zhuge… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have gained significant attention due to their
impressive capabilities in multimodal understanding. However, existing methods rely heavily …

Modality-Inconsistent Continual Learning of Multimodal Large Language Models

W Pian, S Deng, S Mo, Y Guo, Y Tian - arXiv preprint arXiv:2412.13050, 2024 - arxiv.org
In this paper, we introduce Modality-Inconsistent Continual Learning (MICL), a new
continual learning scenario for Multimodal Large Language Models (MLLMs) that involves …

Recent Advances of Multimodal Continual Learning: A Comprehensive Survey

D Yu, X Zhang, Y Chen, A Liu, Y Zhang, PS Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
Continual learning (CL) aims to empower machine learning models to learn continually from
new data, while building upon previously acquired knowledge without forgetting. As …

One VLM to Keep it Learning: Generation and Balancing for Data-free Continual Visual Question Answering

D Das, D Talon, M Mancini, Y Wang, E Ricci - arXiv preprint arXiv …, 2024 - arxiv.org
Vision-Language Models (VLMs) have shown significant promise in Visual Question
Answering (VQA) tasks by leveraging web-scale multimodal datasets. However, these …

LLaCA: Multimodal Large Language Continual Assistant

J Qiao, Z Zhang, X Tan, Y Qu, S Ding, Y Xie - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning guides the Multimodal Large Language Models (MLLMs) in aligning
different modalities by designing text instructions, which seems to be an essential technique …

Towards Lifelong Learning of Large Language Models: A Survey

J Zheng, S Qiu, C Shi, Q Ma - arXiv preprint arXiv:2406.06391, 2024 - arxiv.org
As the applications of large language models (LLMs) expand across diverse fields, the
ability of these models to adapt to ongoing changes in data, tasks, and user preferences …

Multi-Task Diffusion Learning for Time Series Classification

S Zheng, Z Liu, L Tian, L Ye, S Zheng, P Peng, W Chu - Electronics, 2024 - mdpi.com
Current deep learning models for time series often face challenges with generalizability in
scenarios characterized by limited samples or inadequately labeled data. By tapping into the …