SwapMix: Diagnosing and regularizing the over-reliance on visual context in visual question answering

V Gupta, Z Li, A Kortylewski, C Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
While Visual Question Answering (VQA) has progressed rapidly, previous works
raise concerns about robustness of current VQA models. In this work, we study the …

Graph neural networks in vision-language image understanding: A survey

H Senior, G Slabaugh, S Yuan, L Rossi - The Visual Computer, 2024 - Springer
2D image understanding is a complex problem within computer vision, but it holds
the key to providing human-level scene comprehension. It goes further than identifying the …

Graph neural networks for visual question answering: a systematic review

AA Yusuf, C Feng, X Mao, R Ally Duma… - Multimedia Tools and …, 2024 - Springer
Recently, visual question answering (VQA) has gained considerable interest within the
computer vision and natural language processing (NLP) research areas. The VQA task …

A survey of methods, datasets and evaluation metrics for visual question answering

H Sharma, AS Jalal - Image and Vision Computing, 2021 - Elsevier
Visual Question Answering (VQA) is a multi-disciplinary research problem that has
captured the attention of both computer vision as well as natural language processing …

Surgical-VQA: Visual question answering in surgical scenes using transformer

L Seenivasan, M Islam, AK Krishna, H Ren - International Conference on …, 2022 - Springer
Visual question answering (VQA) in surgery is largely unexplored. Expert surgeons are
scarce and are often overloaded with clinical and academic workloads. This overload often …

An improved attention and hybrid optimization technique for visual question answering

H Sharma, AS Jalal - Neural Processing Letters, 2022 - Springer
In Visual Question Answering (VQA), an attention mechanism has a critical role in
specifying the different objects present in an image or tells the machine where to focus by …

Reliability analysis of reinforced soil slope stability using GA-ANFIS, RFC, and GMDH soft computing techniques

R Ray, SS Choudhary, LB Roy, MR Kaloop… - Case Studies in …, 2023 - Elsevier
Soil is a heterogeneous medium, the characteristics that determine soil slope stability are
highly variable, making the analysis a difficult task. The present research approach is …

Surgical-VQLA: Transformer with gated vision-language embedding for visual question localized-answering in robotic surgery

L Bai, M Islam, L Seenivasan… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Despite the availability of computer-aided simulators and recorded videos of surgical
procedures, junior residents still heavily rely on experts to answer their queries. However …

An efficient Intra-Inter pixel encryption scheme to secure healthcare images for an IoT environment

S Dash, S Padhy, SA Devi, S Sachi… - Expert Systems with …, 2023 - Elsevier
Digital images are being frequently used for diagnosis in clinics today. Diagnostic images
with identifying patient data are stored and transmitted across open networks. Security …

Image captioning improved visual question answering

H Sharma, AS Jalal - Multimedia Tools and Applications, 2022 - Springer
Both Visual Question Answering (VQA) and image captioning are the problems
which involve Computer Vision (CV) and Natural Language Processing (NLP) domains. In …