Recent development of Large Vision-Language Models (LVLMs) has attracted growing attention within the AI landscape for its practical implementation potential. However,`` …
Visual language models (VLMs) rapidly progressed with the recent success of large language models. There have been growing efforts on visual instruction tuning to extend the …
Abstract Large Vision-Language Models (LVLMs) have advanced considerably intertwining visual recognition and language understanding to generate content that is not only coherent …
T Yu, Y Yao, H Zhang, T He, Y Han… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding reasoning and interaction. However …
We present DRESS a large vision language model (LVLM) that innovatively exploits Natural Language feedback (NLF) from Large Language Models to enhance its alignment and …
Y Zhu, M Zhu, N Liu, Z Ou, X Mou, J Tang - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce LLaVA-$\phi $(LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model, Phi-2, to facilitate …
This survey presents a comprehensive analysis of the phenomenon of hallucination in multimodal large language models (MLLMs), also known as Large Vision-Language Models …
T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have strong capabilities in solving diverse natural language processing tasks. However, the safety and security issues of LLM systems have become the …
Instruction-following Vision Large Language Models (VLLMs) have achieved significant progress recently on a variety of tasks. These approaches merge strong pre-trained vision …