AutoCaption: Image captioning with neural architecture search

X Zhu, W Wang, L Guo, J Liu - arXiv preprint arXiv:2012.09742, 2020 - arxiv.org
Image captioning transforms complex visual information into abstract natural language for
representation, which can help computers understand the world quickly. However, due to …

Evolutionary recurrent neural network for image captioning

H Wang, H Wang, K Xu - Neurocomputing, 2020 - Elsevier
Automatic architecture search is efficient to discover novel neural networks while it is mostly
employed for pure vision or natural language tasks. However, cross-modality tasks are …

Image understanding by captioning with differentiable architecture search

R Hosseini, P Xie - Proceedings of the 30th ACM international …, 2022 - dl.acm.org
In deep learning applications, image understanding is a crucial task, where several
techniques such as image captioning and visual question answering have been widely …

Stack-captioning: Coarse-to-fine learning for image captioning

J Gu, J Cai, G Wang, T Chen - Proceedings of the AAAI conference on …, 2018 - ojs.aaai.org
The existing image captioning approaches typically train a one-stage sentence decoder,
which is difficult to generate rich fine-grained descriptions. On the other hand, multi-stage …

Attention beam: An image captioning approach

A Shrimal, T Chakraborty - arXiv preprint arXiv:2011.01753, 2020 - arxiv.org
The aim of image captioning is to generate textual description of a given image. Though
seemingly an easy task for humans, it is challenging for machines as it requires the ability to …

Revisiting knowledge distillation for image captioning

J Dong, Z Hu, Y Zhou - … : First CAAI International Conference, CICAI 2021 …, 2021 - Springer
Knowledge Distillation (KD)[6], as an effective technique for model compression
and improving a model's performance, has been widely studied and adopted. However …

Fast, diverse and accurate image captioning guided by part-of-speech

A Deshpande, J Aneja, L Wang… - Proceedings of the …, 2019 - openaccess.thecvf.com
Image captioning is an ambiguous problem, with many suitable captions for an image. To
address ambiguity, beam search is the de facto method for sampling multiple captions …

Show and tell: A neural image caption generator

O Vinyals, A Toshev, S Bengio… - Proceedings of the IEEE …, 2015 - cv-foundation.org
Automatically describing the content of an image is a fundamental problem in artificial
intelligence that connects computer vision and natural language processing. In this paper …

Bidirectional beam search: Forward-backward inference in neural sequence models for fill-in-the-blank image captioning

Q Sun, S Lee, D Batra - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
We develop the first approximate inference algorithm for 1-Best (and M-Best) decoding in
bidirectional neural sequence models by extending Beam Search (BS) to reason about both …

CapOnImage: Context-driven dense-captioning on image

Y Gao, X Hou, Y Zhang, T Ge, Y Jiang… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing image captioning systems are dedicated to generating narrative captions for
images, which are spatially detached from the image in presentation. However, texts can …