Exploring the frontiers of deep learning and natural language processing: A comprehensive overview of key challenges and emerging trends

W Khan, A Daud, K Khan, S Muhammad… - Natural Language …, 2023 - Elsevier
In the recent past, more than 5 years or so, DL especially the large language models (LLMs)
has generated extensive studies out of a distinctly average downturn field of knowledge …

Image captioning with adaptive incremental global context attention

C Wang, X Gu - Applied Intelligence, 2022 - Springer
The encoder-decoder framework has proliferated in current image captioning task, where
the decoder generates target description word by word based on the preceding captions …

Dynamic-balanced double-attention fusion for image captioning

C Wang, X Gu - Engineering Applications of Artificial Intelligence, 2022 - Elsevier
Image captioning has received significant attention in the cross-modal field in which spatial
and channel attentions play a crucial role. However, such attention-based approaches …

Sequential vision to language as story: A storytelling dataset and benchmarking

ZM Malakan, S Anwar, GM Hassan, A Mian - IEEE Access, 2023 - ieeexplore.ieee.org
Storytelling is a remarkable human skill that plays a significant role in learning and
experiencing everyday life. Developing narratives is central to human mental health …

Towards Retrieval-Augmented Architectures for Image Captioning

S Sarto, M Cornia, L Baraldi, A Nicolosi… - ACM Transactions on …, 2024 - dl.acm.org
The objective of image captioning models is to bridge the gap between the visual and
linguistic modalities by generating natural language descriptions that accurately reflect the …

Optimal transformers based image captioning using beam search

A Shetty, Y Kale, Y Patil, R Patil, S Sharma - Multimedia Tools and …, 2024 - Springer
Image Captioning is the process of generating textual descriptions of given images. It
encompasses two major fields of deep learning, computer vision, and natural language …

Vision transformer based model for describing a set of images as a story

ZM Malakan, GM Hassan, A Mian - Australasian Joint Conference on …, 2022 - Springer
Visual Story-Telling is the process of forming a multi sentence story from a set of
images. Appropriately including visual variation and contextual information captured inside …

A novel approach for suspicious activity detection with deep learning

N Dwivedi, DK Singh, DS Kushwaha - Multimedia Tools and Applications, 2023 - Springer
Suspicious human activities like fighting, shooting, fire have got serious security concern in
public places because of a steep surge in these types of cases all around. CCTV cameras …

Image caption generation using multi-level semantic context information

P Tian, H Mo, L Jiang - Symmetry, 2021 - mdpi.com
Object detection, visual relationship detection, and image captioning, which are the three
main visual tasks in scene understanding, are highly correlated and correspond to different …

Collaborative strategy network for spatial attention image captioning

D Zhou, J Yang, R Bao - Applied Intelligence, 2022 - Springer
Automatic image captioning is an interesting task that lies at the intersection of computer
vision and natural language processing. Although image captioning based on reinforcement …