A review of multimodal explainable artificial intelligence: Past, present and future

S Sun, W An, F Tian, F Nan, Q Liu, J Liu, N Shah… - arXiv preprint arXiv …, 2024 - arxiv.org
Artificial intelligence (AI) has rapidly developed through advancements in computational
power and the growth of massive datasets. However, this progress has also heightened …

Survey of different large language model architectures: Trends, benchmarks, and challenges

M Shao, A Basit, R Karri, M Shafique - IEEE Access, 2024 - ieeexplore.ieee.org
Large Language Models (LLMs) represent a class of deep learning models adept at
understanding natural language and generating coherent responses to various prompts or …

Automating Systematic Literature Reviews with Retrieval-Augmented Generation: A Comprehensive Overview

B Han, T Susnjak, A Mathrani - Applied Sciences, 2024 - mdpi.com
This study examines Retrieval-Augmented Generation (RAG) in large language models
(LLMs) and their significant application for undertaking systematic literature reviews (SLRs) …

MathScape: Evaluating MLLMs in multimodal math scenarios through a hierarchical benchmark

M Zhou, H Liang, T Li, Z Wu, M Lin, L Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
With the development of Multimodal Large Language Models (MLLMs), the evaluation of
multimodal models in the context of mathematical problems has become a valuable …

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Z Qin, D Chen, W Zhang, L Yao, Y Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of large language models (LLMs) has been witnessed in recent
years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from …

MAmmoTH-VL: Eliciting multimodal reasoning with instruction tuning at scale

J Guo, T Zheng, Y Bai, B Li, Y Wang, K Zhu, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Open-source multimodal large language models (MLLMs) have shown significant potential
in a broad range of multimodal tasks. However, their reasoning capabilities remain …

Data-Juicer Sandbox: A comprehensive suite for multimodal data-model co-development

D Chen, H Wang, Y Huang, C Ge, Y Li, B Ding… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence of large-scale multi-modal generative models has drastically advanced
artificial intelligence, introducing unprecedented levels of performance and functionality …

SynthVLM: High-efficiency and high-quality synthetic data for vision language models

Z Liu, H Liang, X Huang, W Xiong, Q Yu, L Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, with the rise of web images, managing and understanding large-scale image
datasets has become increasingly important. Vision Large Language Models (VLLMs) have …

KeyVideoLLM: Towards large-scale video keyframe selection

H Liang, J Li, T Bai, X Huang, L Sun, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, with the rise of web videos, managing and understanding large-scale video
datasets has become increasingly important. Video Large Language Models (VideoLLMs) …

Synth-Empathy: Towards high-quality synthetic empathy data

H Liang, L Sun, J Wei, X Huang, L Sun, B Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, with the rapid advancements in large language models (LLMs), achieving
excellent empathetic response capabilities has become a crucial prerequisite …