Region-aware image captioning via interaction learning

AA Liu, Y Zhai, N Xu, W Nie, W Li… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Image captioning is one of the primary goals in computer vision which aims to automatically
generate natural descriptions for images. Intuitively, the human visual system can notice some …

Aligning linguistic words and visual semantic units for image captioning

L Guo, J Liu, J Tang, J Li, W Luo, H Lu - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Image captioning attempts to generate a sentence composed of several linguistic words,
which are used to describe objects, attributes, and interactions in an image, denoted as …

Bi-directional co-attention network for image captioning

W Jiang, W Wang, H Hu - ACM Transactions on Multimedia Computing …, 2021 - dl.acm.org
Image Captioning, which automatically describes an image with natural language, is
regarded as a fundamental challenge in computer vision. In recent years, significant …

Context-aware visual policy network for sequence-level image captioning

D Liu, ZJ Zha, H Zhang, Y Zhang, F Wu - Proceedings of the 26th ACM …, 2018 - dl.acm.org
Many vision-language tasks can be reduced to the problem of sequence prediction for
natural language output. In particular, recent advances in image captioning use deep …

Know more say less: Image captioning based on scene graphs

X Li, S Jiang - IEEE Transactions on Multimedia, 2019 - ieeexplore.ieee.org
Automatically describing the content of an image has been attracting considerable research
attention in the multimedia field. To represent the content of an image, many approaches …

Difnet: Boosting visual information flow for image captioning

M Wu, X Zhang, X Sun, Y Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com
Current image captioning (IC) methods predict textual words sequentially based on the input
visual information from the visual feature extractor and the partially generated sentence …

Spatio-temporal memory attention for image captioning

J Ji, C Xu, X Zhang, B Wang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Visual attention has been successfully applied in image captioning to selectively incorporate
the most relevant areas to the language generation procedure. However, the attention in …

High-order interaction learning for image captioning

Y Wang, N Xu, AA Liu, W Li… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Image captioning aims at understanding various semantic concepts (e.g., objects and
relationships) from an image and integrating them in a sentence-level description. Hence, it …

Task-adaptive attention for image captioning

C Yan, Y Hao, L Li, J Yin, A Liu, Z Mao… - … on Circuits and …, 2021 - ieeexplore.ieee.org
Attention mechanisms are now widely used in image captioning models. However, most
attention models only focus on visual features. When generating syntax related words, little …

Show, Observe and Tell: Attribute-driven Attention Model for Image Captioning

H Chen, G Ding, Z Lin, S Zhao, J Han - IJCAI, 2018 - ijcai.org
Attribute-based approaches and attention-based approaches have been proven to be
effective in image captioning. However, most attribute-based approaches simply predict …