T Ma, S Li, Y Xiao, S Liu - arXiv preprint arXiv:2304.05402, 2023 - arxiv.org
The transferability of adversarial examples is a crucial aspect of evaluating the robustness of deep learning systems, particularly in black-box scenarios. Although several methods have …
The development of large language models (LLMs) has significantly advanced the emergence of large multimodal models (LMMs). While LMMs have achieved tremendous …
Z Meng, Y Dai, Z Gong, S Guo, M Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in Large Vision-Language Models (LVLMs) have significantly improved performance in image comprehension tasks, such as formatted charts and rich-content …
R Hu, Y Tu, J Sang - arXiv preprint arXiv:2404.10332, 2024 - arxiv.org
Despite achieving outstanding performance on various cross-modal tasks, current large vision-language models (LVLMs) still suffer from hallucination issues, manifesting as …
In the digital age, determining the authenticity of images has become increasingly crucial. This study aims to explore the capability of machine learning models in identifying …
While deep neural networks (DNNs) achieve impressive performance on environment perception tasks, their sensitivity to adversarial perturbations limits their use in practical …
A Gunjal, J Yin, E Bas - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Instruction-tuned Large Vision Language Models (LVLMs) have advanced significantly in generalizing across a diverse set of multi-modal tasks, especially for Visual Question …
Despite rapid development and widespread application, Large Vision-Language Models (LVLMs) confront a serious challenge of being prone to generating …
C Jiang, W Ye, M Dong, H Jia, H Xu, M Yan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision Language Models exhibit remarkable capabilities but struggle with hallucinations: inconsistencies between images and their descriptions. Previous …