Visualizing Invariant Features in Vision Models

F Sammani, B Joukovsky… - 2023 24th International …, 2023 - ieeexplore.ieee.org
Explainable AI is important for improving transparency, accountability, trust, and ethical
considerations in AI systems, and for enabling users to make informed decisions based on …

Boosting Cross-task Transferability of Adversarial Patches with Visual Relations

T Ma, S Li, Y Xiao, S Liu - arXiv preprint arXiv:2304.05402, 2023 - arxiv.org
The transferability of adversarial examples is a crucial aspect of evaluating the robustness of
deep learning systems, particularly in black-box scenarios. Although several methods have …

Deem: Diffusion models serve as the eyes of large language models for image perception

R Luo, Y Li, L Chen, W He, TE Lin, Z Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The development of large language models (LLMs) has significantly advanced the
emergence of large multimodal models (LMMs). While LMMs have achieved tremendous …

VGA: Vision GUI Assistant--Minimizing Hallucinations through Image-Centric Fine-Tuning

Z Meng, Y Dai, Z Gong, S Guo, M Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in Large Vision-Language Models (LVLMs) have significantly improved
performance in image comprehension tasks, such as formatted charts and rich-content …

Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning

R Hu, Y Tu, J Sang - arXiv preprint arXiv:2404.10332, 2024 - arxiv.org
Despite achieving outstanding performance on various cross-modal tasks, current large
vision-language models (LVLMs) still suffer from hallucination issues, manifesting as …

Image Authenticity Detection using Eye Gazing Data: A Performance Comparison Beyond Human Capabilities via Attention Mechanism, ResNet, and Cascade …

P Zhang - openreview.net
In the digital age, determining the authenticity of images has become increasingly crucial.
This study aims to explore the capability of machine learning models in identifying …

Detecting adversarial perturbations in multi-task perception

M Klingner, VR Kumar, S Yogamani… - 2022 IEEE/RSJ …, 2022 - ieeexplore.ieee.org
While deep neural networks (DNNs) achieve impressive performance on environment
perception tasks, their sensitivity to adversarial perturbations limits their use in practical …

Detecting and preventing hallucinations in large vision language models

A Gunjal, J Yin, E Bas - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Instruction-tuned Large Vision Language Models (LVLMs) have significantly advanced in
generalizing across a diverse set of multi-modal tasks, especially for Visual Question …

Ibd: Alleviating hallucinations in large vision-language models via image-biased decoding

L Zhu, D Ji, T Chen, P Xu, J Ye, J Liu - arXiv preprint arXiv:2402.18476, 2024 - arxiv.org
Despite rapid development and widespread application, Large Vision-
Language Models (LVLMs) confront a serious challenge of being prone to generating …

Hal-eval: A universal and fine-grained hallucination evaluation framework for large vision language models

C Jiang, W Ye, M Dong, H Jia, H Xu, M Yan… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision Language Models exhibit remarkable capabilities but struggle with
hallucinations: inconsistencies between images and their descriptions. Previous …