The multi-modal fusion in visual question answering: a review of attention mechanisms

S Lu, M Liu, L Yin, Z Yin, X Liu, W Zheng - PeerJ Computer Science, 2023 - peerj.com
Visual Question Answering (VQA) is a significant cross-disciplinary issue in the
fields of computer vision and natural language processing that requires a computer to output …

An analysis of graph convolutional networks and recent datasets for visual question answering

AA Yusuf, F Chong, M Xianling - Artificial Intelligence Review, 2022 - Springer
Graph neural network is a deep learning approach widely applied on structural and non-
structural scenarios due to its substantial performance and interpretability recently. In a non …

A survey of methods, datasets and evaluation metrics for visual question answering

H Sharma, AS Jalal - Image and Vision Computing, 2021 - Elsevier
Visual Question Answering (VQA) is a multi-disciplinary research problem that has
captured the attention of both computer vision as well as natural language processing …

An improved attention and hybrid optimization technique for visual question answering

H Sharma, AS Jalal - Neural Processing Letters, 2022 - Springer
In Visual Question Answering (VQA), an attention mechanism has a critical role in
specifying the different objects present in an image or tells the machine where to focus by …

Image captioning improved visual question answering

H Sharma, AS Jalal - Multimedia Tools and Applications, 2022 - Springer
Both Visual Question Answering (VQA) and image captioning are the problems
which involve Computer Vision (CV) and Natural Language Processing (NLP) domains. In …

Graph neural networks for visual question answering: a systematic review

AA Yusuf, C Feng, X Mao, R Ally Duma… - Multimedia Tools and …, 2024 - Springer
Recently, visual question answering (VQA) has gained considerable interest within the
computer vision and natural language processing (NLP) research areas. The VQA task …

VQA and visual reasoning: An overview of recent datasets, methods and challenges

RY Zakari, JW Owusu, H Wang, K Qin, ZK Lawal… - arXiv preprint arXiv …, 2022 - arxiv.org
Artificial Intelligence (AI) and its applications have sparked extraordinary interest in recent
years. This achievement can be ascribed in part to advances in AI subfields including …

A COVID-19 X-ray image classification model based on an enhanced convolutional neural network and hill climbing algorithms

AK Pradhan, D Mishra, K Das, MS Obaidat… - Multimedia Tools and …, 2023 - Springer
The classification of medical images is significant among researchers and physicians for the
early identification and clinical treatment of many disorders. Though, traditional classifiers …

Research on visual question answering based on GAT relational reasoning

Y Miao, W Cheng, S He, H Jiang - Neural Processing Letters, 2022 - Springer
Due to the diversity of questions in VQA, it brings new challenges to the construction of VQA
model. Existing VQA models focus on constructing a new attention mechanism, which …

Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets

AA Yusuf, F Chong, M Xianling - Multimedia Tools and Applications, 2022 - Springer
In the recent era, graph neural networks are widely used on vision-to-language tasks and
achieved promising results. In particular, graph convolution network (GCN) is capable of …