From Large Language Models to Large Multimodal Models: A Literature Review

D Huang, C Yan, Q Li, X Peng - Applied Sciences, 2024 - mdpi.com
With the deepening of research on Large Language Models (LLMs), significant progress has
been made in recent years on the development of Large Multimodal Models (LMMs), which …

A Survey of Multimodal Large Language Model from A Data-centric Perspective

T Bai, H Liang, B Wan, L Yang, B Li, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Human beings perceive the world through diverse senses such as sight, smell, hearing, and
touch. Similarly, multimodal large language models (MLLMs) enhance the capabilities of …

A Review of Multi-Modal Large Language and Vision Models

K Carolan, L Fennelly, AF Smeaton - arXiv preprint arXiv:2404.01322, 2024 - arxiv.org
Large Language Models (LLMs) have recently emerged as a focal point of research and
application, driven by their unprecedented ability to understand and generate text with …

MM-LLMs: Recent advances in multimodal large language models

D Zhang, Y Yu, C Li, J Dong, D Su, C Chu… - arXiv preprint arXiv …, 2024 - arxiv.org
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …

LaVy: Vietnamese Multimodal Large Language Model

C Tran, HL Thanh - arXiv preprint arXiv:2404.07922, 2024 - arxiv.org
Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have
taken the world by storm with impressive abilities in complex reasoning and linguistic …

Imp: Highly Capable Large Multimodal Models for Mobile Devices

Z Shao, Z Yu, J Yu, X Ouyang, L Zheng, Z Gai… - arXiv preprint arXiv …, 2024 - arxiv.org
By harnessing the capabilities of large language models (LLMs), recent large multimodal
models (LMMs) have shown remarkable versatility in open-world multimodal understanding …

X-LLM: Bootstrapping advanced large language models by treating multi-modalities as foreign languages

F Chen, M Han, H Zhao, Q Zhang, J Shi, S Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable language abilities. GPT-4,
based on advanced LLMs, exhibits extraordinary multimodal capabilities beyond previous …

Exploring the reasoning abilities of multimodal large language models (MLLMs): A comprehensive survey on emerging trends in multimodal reasoning

Y Wang, W Chen, X Han, X Lin, H Zhao, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Strong Artificial Intelligence (Strong AI) or Artificial General Intelligence (AGI) with abstract
reasoning ability is the goal of next-generation AI. Recent advancements in Large Language …

Model Composition for Multimodal Large Language Models

C Chen, Y Du, Z Fang, Z Wang, F Luo, P Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent developments in Multimodal Large Language Models (MLLMs) have shown rapid
progress, moving towards the goal of creating versatile MLLMs that understand inputs from …

Exploring the Theoretical Dimensions and Intricate Behaviors of Large Language Models and their Multimodal Counterparts

K Desai, S Yadav, R Murugan - 2024 IEEE 13th International …, 2024 - ieeexplore.ieee.org
In recent years, there has been an explosion of development and change in the field of big
language models and their multimodal equivalents. Natural language processing, synthesis …