Spatio-temporal memory attention for image captioning

J Ji, C Xu, X Zhang, B Wang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Visual attention has been successfully applied in image captioning to selectively incorporate
the most relevant areas to the language generation procedure. However, the attention in …

Show, Observe and Tell: Attribute-driven Attention Model for Image Captioning

H Chen, G Ding, Z Lin, S Zhao, J Han - IJCAI, 2018 - ijcai.org
Attribute-based approaches and attention-based approaches have been proven to be
effective in image captioning. However, most attribute-based approaches simply predict …

Look back and predict forward in image captioning

Y Qin, J Du, Y Zhang, H Lu - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
Most existing attention-based methods on image captioning focus on the current word and
visual information in one time step and generate the next word, without considering the …

Aligning linguistic words and visual semantic units for image captioning

L Guo, J Liu, J Tang, J Li, W Luo, H Lu - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Image captioning attempts to generate a sentence composed of several linguistic words,
which are used to describe objects, attributes, and interactions in an image, denoted as …

Dual attention on pyramid feature maps for image captioning

L Yu, J Zhang, Q Wu - IEEE Transactions on Multimedia, 2021 - ieeexplore.ieee.org
Generating natural sentences from images is a fundamental learning task for visual-
semantic understanding in multimedia. In this paper, we propose to apply dual attention on …

Task-adaptive attention for image captioning

C Yan, Y Hao, L Li, J Yin, A Liu, Z Mao… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Attention mechanisms are now widely used in image captioning models. However, most
attention models only focus on visual features. When generating syntax related words, little …

A new attention-based LSTM for image captioning

F Xiao, W Xue, Y Shen, X Gao - Neural Processing Letters, 2022 - Springer
Image captioning aims to describe the content of an image with a complete and natural
sentence. Recently, the image captioning methods with encoder-decoder architecture have …

Divergent-convergent attention for image captioning

J Ji, Z Du, X Zhang - Pattern Recognition, 2021 - Elsevier
Attention mechanism has made great progress in image captioning, where semantic words
or local regions are selectively embedded into the language model. However, current …

The synergy of double attention: Combine sentence-level and word-level attention for image captioning

H Wei, Z Li, C Zhang, H Ma - Computer Vision and Image Understanding, 2020 - Elsevier
The existing attention models of image captioning typically extract only word-level attention
information, i.e., the attention mechanism extracts local attention information from the image …

Bi-directional co-attention network for image captioning

W Jiang, W Wang, H Hu - ACM Transactions on Multimedia Computing …, 2021 - dl.acm.org
Image Captioning, which automatically describes an image with natural language, is
regarded as a fundamental challenge in computer vision. In recent years, significant …