An improved attention and hybrid optimization technique for visual question answering

FS Gharehchopogh, S Ghafouri, M Namazi… - Journal of Bionic …, 2024 - Springer

This paper comprehensively analyzes the Manta Ray Foraging Optimization (MRFO)
algorithm and its integration into diverse academic fields. Introduced in 2020, the MRFO …

Cited by 15 - Related articles - All 3 versions

Graph neural networks for visual question answering: a systematic review

AA Yusuf, C Feng, X Mao, R Ally Duma… - Multimedia Tools and …, 2024 - Springer

Recently, visual question answering (VQA) has gained considerable interest within the
computer vision and natural language processing (NLP) research areas. The VQA task …

Cited by 2 - Related articles

Multilevel attention and relation network based image captioning model

H Sharma, S Srivastava - Multimedia Tools and Applications, 2023 - Springer

The aim of the image captioning task is to understand various semantic concepts such as
objects and their relationships in an image and combine them to generate a natural …

Cited by 12 - Related articles - All 4 versions

A hybrid improved manta ray foraging optimization with Tabu search algorithm for solving the UAV placement problem in smart cities

AA Saadi, A Soukane, Y Meraihi, AB Gabis… - IEEE …, 2023 - ieeexplore.ieee.org

The concept of smart cities is to enhance the life quality of residents and provide efficient
services by integrating advanced information and communication technologies, autonomous …

Cited by 7 - Related articles - All 12 versions

Improving visual question answering by combining scene-text information

H Sharma, AS Jalal - Multimedia Tools and Applications, 2022 - Springer

The text present in natural scenes contains semantic information about its surrounding
environment. For example, the majority of questions asked by blind people related to images …

Cited by 13 - Related articles - All 4 versions

Graph neural network-based visual relationship and multilevel attention for image captioning

H Sharma, S Srivastava - Journal of Electronic Imaging, 2022 - spiedigitallibrary.org

With the remarkable success of the image captioning tasks, visual attention methods have
become a vital part of captioning models. However, most attention-based image captioning …

Cited by 8 - Related articles - All 3 versions

A framework for image captioning based on relation network and multilevel attention mechanism

H Sharma, S Srivastava - Neural Processing Letters, 2023 - Springer

Understanding different semantic concepts, such as objects and their relationships in an
image, and integrating them to produce a natural language description is the goal of the …

Cited by 2 - Related articles - All 2 versions

Image Captioning based on Deep Convolutional Neural Networks and LSTM

S Srivastava, H Sharma, P Dixit - 2022 2nd International …, 2022 - ieeexplore.ieee.org

Image captioning is a challenging task that needs the knowledge from both computer vision
algorithms and language processing techniques. The model must be able to understand an …

Cited by 5 - Related articles

Dual capsule attention mask network with mutual learning for visual question answering

W Tian, H Li, ZQ Zhao - … of the 29th International Conference on …, 2022 - aclanthology.org

A Visual Question Answering (VQA) model processes images and questions
simultaneously with rich semantic information. The attention mechanism can highlight fine …

Cited by 3 - Related articles - All 2 versions

Visual question answering model based on the fusion of multimodal features by a two-way co-attention mechanism

H Sharma, S Srivastava - The Imaging Science Journal, 2021 - Taylor & Francis

Scene Text Visual Question Answering (VQA) needs to understand both the
visual contents and the texts in an image to predict an answer for the image-related …

Cited by 6 - Related articles - All 2 versions
