Towards Deconfounded Visual Question Answering via Dual-causal Intervention

D Peng, W Wei - Proceedings of the 33rd ACM International Conference …, 2024 - dl.acm.org
The Visual Question Answering (VQA) task has recently become notorious because models
are prone to predicting well-educated "guesses" as answers rather than deriving them …

Combating Visual Question Answering Hallucinations via Robust Multi-Space Co-Debias Learning

J Zhu, Y Liu, H Zhu, H Lin, Y Jiang, Z Zhang… - Proceedings of the 32nd …, 2024 - dl.acm.org
The challenge of bias in visual question answering (VQA) has gained considerable attention
in contemporary research. Various intricate bias dependencies, such as modalities and data …

Generative Adversarial Networks with Learnable Auxiliary Module for Image Synthesis

Y Gan, C Yang, M Ye, R Huang, D Ouyang - ACM Transactions on …, 2024 - dl.acm.org
Training generative adversarial networks (GANs) for noise-to-image synthesis is a challenging
task, primarily due to the instability of GANs' training process. One of the key issues is the …

Robust visual question answering: Datasets, methods, and future challenges

J Ma, P Wang, D Kong, Z Wang, J Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Visual question answering requires a system to provide an accurate natural language
answer given an image and a natural language question. However, it is widely recognized …

GRACE: Graph-Based Contextual Debiasing for Fair Visual Question Answering

Y Zhang, M Jiang, Q Zhao - European Conference on Computer Vision, 2024 - Springer
Large language models (LLMs) exhibit exceptional reasoning capabilities and have played
significant roles in knowledge-based visual question-answering (VQA) systems. By …

Answering, Fast and Slow: Strategy enhancement of visual understanding guided by causality

C Wang, Z Wang, Y Zhou - Neurocomputing, 2025 - Elsevier
In his classic book Thinking, Fast and Slow (Daniel, 2017), Kahneman points out that human
thinking can be categorized into two main modes: a system that displays intuition …

Language-guided Bias Generation Contrastive Strategy for Visual Question Answering

E Zhao, N Song, Z Zhang, J Nie, X Liang… - ACM Transactions on …, 2025 - dl.acm.org
Visual question answering (VQA) is a challenging task that requires models to understand
both visual and linguistic inputs and produce accurate answers. However, VQA models often …

Overcoming Language Priors for Visual Question Answering Based on Knowledge Distillation

D Peng, W Wei - … Conference on Multimedia and Expo (ICME), 2024 - ieeexplore.ieee.org
Previous studies have pointed out that visual question answering (VQA) models are prone to
relying on language priors for answer predictions. In this context, predictions often depend …

Robust Visual Question Answering With Contrastive-Adversarial Consistency Constraints

J Zhu, M Ding, Y Liu, B Zeng, G Lu… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Visual cues and question semantics contribute to final answer predictions from distinct
perspectives. However, inherent language bias confounds the relationship between visual …

Collaborative Modality Fusion for Mitigating Language Bias in Visual Question Answering

Q Lu, S Chen, X Zhu - Journal of Imaging, 2024 - mdpi.com
Language bias stands as a noteworthy concern in visual question answering (VQA),
wherein models tend to rely on spurious correlations between questions and answers for …