Reference-based model using multimodal gated recurrent units for image captioning

Exploring the frontiers of deep learning and natural language processing: A comprehensive overview of key challenges and emerging trends

W Khan, A Daud, K Khan, S Muhammad… - Natural Language …, 2023 - Elsevier

In the recent past, more than 5 years or so, DL especially the large language models (LLMs)
has generated extensive studies out of a distinctly average downturn field of knowledge …

Cited by 48 · Related articles

Image captioning with adaptive incremental global context attention

C Wang, X Gu - Applied Intelligence, 2022 - Springer

The encoder-decoder framework has proliferated in current image captioning task, where
the decoder generates target description word by word based on the preceding captions …

Cited by 21 · Related articles · All 3 versions

Dynamic-balanced double-attention fusion for image captioning

C Wang, X Gu - Engineering Applications of Artificial Intelligence, 2022 - Elsevier

Image captioning has received significant attention in the cross-modal field in which spatial
and channel attentions play a crucial role. However, such attention-based approaches …

Cited by 11 · Related articles · All 2 versions

Sequential vision to language as story: A storytelling dataset and benchmarking

ZM Malakan, S Anwar, GM Hassan, A Mian - IEEE Access, 2023 - ieeexplore.ieee.org

Storytelling is a remarkable human skill that plays a significant role in learning and
experiencing everyday life. Developing narratives is central to human mental health …

Cited by 3 · Related articles · All 4 versions

Towards Retrieval-Augmented Architectures for Image Captioning

S Sarto, M Cornia, L Baraldi, A Nicolosi… - ACM Transactions on …, 2024 - dl.acm.org

The objective of image captioning models is to bridge the gap between the visual and
linguistic modalities by generating natural language descriptions that accurately reflect the …

Cited by 2 · Related articles · All 5 versions

Optimal transformers based image captioning using beam search

A Shetty, Y Kale, Y Patil, R Patil, S Sharma - Multimedia Tools and …, 2024 - Springer

Image Captioning is the process of generating textual descriptions of given images. It
encompasses two major fields of deep learning, computer vision, and natural language …

Cited by 3 · Related articles

Vision transformer based model for describing a set of images as a story

ZM Malakan, GM Hassan, A Mian - Australasian Joint Conference on …, 2022 - Springer

Abstract Visual Story-Telling is the process of forming a multi sentence story from a set of
images. Appropriately including visual variation and contextual information captured inside …

Cited by 10 · Related articles · All 7 versions

A novel approach for suspicious activity detection with deep learning

N Dwivedi, DK Singh, DS Kushwaha - Multimedia Tools and Applications, 2023 - Springer

Suspicious human activities like fighting, shooting, fire have got serious security concern in
public places because of a steep surge in these types of cases all around. CCTV cameras …

Cited by 6 · Related articles · All 3 versions

Image caption generation using multi-level semantic context information

P Tian, H Mo, L Jiang - Symmetry, 2021 - mdpi.com

Object detection, visual relationship detection, and image captioning, which are the three
main visual tasks in scene understanding, are highly correlated and correspond to different …

Cited by 11 · Related articles · All 5 versions

Collaborative strategy network for spatial attention image captioning

D Zhou, J Yang, R Bao - Applied Intelligence, 2022 - Springer

Automatic image captioning is an interesting task that lies at the intersection of computer
vision and natural language processing. Although image captioning based on reinforcement …

Cited by 7 · Related articles · All 3 versions
