A framework for visual question answering with the integration of scene-text using PHOCs...

Image captioning for effective use of language models in knowledge-based visual question answering

A Salaberria, G Azkune, OL de Lacalle, A Soroa… - Expert Systems with …, 2023 - Elsevier

Integrating outside knowledge for reasoning in visio-linguistic tasks such as visual question
answering (VQA) is an open problem. Given that pretrained language models have been …

Cited by 41 Related articles All 4 versions

An AI-based medical chatbot model for infectious disease prediction

S Chakraborty, H Paul, S Ghatak, SK Pandey… - IEEE …, 2022 - ieeexplore.ieee.org

The purpose of this paper is to show concisely how we can promote chatbots in the medical
sector and cure infectious diseases. We can create awareness through the users and the …

Cited by 22 Related articles All 5 versions

Question-guided feature pyramid network for medical visual question answering

Y Yu, H Li, H Shi, L Li, J Xiao - Expert Systems with Applications, 2023 - Elsevier

Medical VQA (VQA-Med) is a critical multi-modal task that raises attention from the
community. Existing models utilized just one high-level feature map (i.e., the last layer feature …

Cited by 10 Related articles All 2 versions

Multilevel attention and relation network based image captioning model

H Sharma, S Srivastava - Multimedia Tools and Applications, 2023 - Springer

The aim of the image captioning task is to understand various semantic concepts such as
objects and their relationships in an image and combine them to generate a natural …

Cited by 12 Related articles All 4 versions

Heterogeneous Graph Fusion Network for cross-modal image-text retrieval

X Qin, L Li, G Pang, F Hao - Expert Systems with Applications, 2024 - Elsevier

Exploring the semantic correspondence of image-text pairs is significant as it bridges vision
and language. Most prior works focus on global semantic alignment or local semantic …

Cited by 2 Related articles

Effect of sulphidation process on the structure, morphology and optical properties of GO/AgNWs composites

MB Baghirov, M Muradov, G Eyvazova… - RSC …, 2024 - pubs.rsc.org

In this study, composite materials composed of graphene oxide (GO) synthesized by a
modified Hummers' method and silver nanowires (AgNWs) synthesized by a modified polyol …

Cited by 1 Related articles All 7 versions

Novel gamma-irradiated chitosan-doped reduced graphene-CuInS2 composites as counter electrodes for dye-sensitized solar cells

Y Areerob, C Hamontree, P Sricharoen… - RSC …, 2022 - pubs.rsc.org

To address the issues associated with traditional counter electrodes, a novel gamma-irradiated
chitosan-doped reduced graphene-CuInS2 composite (Chi@RGO-CIS) was used …

Cited by 9 Related articles All 10 versions

Graph neural network-based visual relationship and multilevel attention for image captioning

H Sharma, S Srivastava - Journal of Electronic Imaging, 2022 - spiedigitallibrary.org

With the remarkable success of the image captioning tasks, visual attention methods have
become a vital part of captioning models. However, most attention-based image captioning …

Cited by 8 Related articles All 3 versions

A framework for image captioning based on relation network and multilevel attention mechanism

H Sharma, S Srivastava - Neural Processing Letters, 2023 - Springer

Understanding different semantic concepts, such as objects and their relationships in an
image, and integrating them to produce a natural language description is the goal of the …

Cited by 2 Related articles All 2 versions

Image Captioning based on Deep Convolutional Neural Networks and LSTM

S Srivastava, H Sharma, P Dixit - 2022 2nd International …, 2022 - ieeexplore.ieee.org

Image captioning is a challenging task that needs the knowledge from both computer vision
algorithms and language processing techniques. The model must be able to understand an …

Cited by 5 Related articles
