Modeling the bandwidth of perceptual experience using deep convolutional neural networks

M Cohen, K Lydic, NAR Murty - 2022 - europepmc.org
When observers glance upon a natural scene, which aspects of that scene ultimately reach
perceptual awareness? To answer this question, we showed observers images of scenes …

Bidirectional Contrastive Split Learning for Visual Question Answering

Y Sun, H Ochiai - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Visual Question Answering (VQA) based on multi-modal data facilitates real-life applications
such as home robots and medical diagnoses. One significant challenge is to devise a robust …

[PDF][PDF] Understanding the Dependence of Perception Model Competency on Regions in an Image

S Pohland, C Tomlin - people.eecs.berkeley.edu
While deep neural network (DNN)-based perception models are useful for many
applications, these models are black boxes and their outputs are not yet well understood. To …

PhD: A Prompted Visual Hallucination Evaluation Dataset

J Liu, Y Fu, R Xie, R Xie, X Sun, F Lian, Z Kang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid growth of Large Language Models (LLMs) has driven the development of Large
Vision-Language Models (LVLMs). The challenge of hallucination, prevalent in LLMs, also …

Aligning large multi-modal model with robust instruction tuning

F Liu, K Lin, L Li, J Wang, Y Yacoob, L Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the promising progress in multi-modal tasks, current large multi-modal models
(LMM) are prone to hallucinating inconsistent descriptions with respect to the associated …

[PDF][PDF] Pragmatic Analytics on Hybrid Computer Vision Models to Develop a Stable Framework for Visual Impairment–A Survey

S Sajini, B Pushpa - researchgate.net
Computer vision is a field associated with artificial intelligence and image processing that extracts meaningful information from visual content such as images and videos …

Mitigating hallucination in visual language models with visual supervision

Z Chen, Y Zhu, Y Zhan, Z Li, C Zhao, J Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large vision-language models (LVLMs) suffer heavily from hallucination, occasionally generating responses that plainly contradict the image content. The key problem lies in their weak …

[PDF][PDF] Feature-disentangled reconstruction of perception from multi-unit recording

TM Dado, P Papale, A Lozano, L Le, MAJ van Gerven… - 2023 - repository.ubn.ru.nl
Here, we aimed to explain neural representations of perception, for which we analyzed the
relationship between multi-unit activity (MUA) recorded from the primate brain and various …

Measuring Noticeability: Multi-scale Context Aggregation for Prioritizing Video Anomalies

Y Zhong, EV Doggett, W Cui, K Qi… - … Joint Conference on …, 2022 - ieeexplore.ieee.org
Determining how impactful an anomaly is on the viewing experience of an audience is
important to production studios, content creators, and content distributors. However, judging …

Deep learning interpretability with visual analytics: Exploring reasoning and bias exploitation

T Jaunet - 2022 - theses.hal.science
Over the past few years, AI and machine learning have evolved from research areas
secluded in laboratories far from the public, to technologies deployed on an industrial scale …