Generation and comprehension of unambiguous object descriptions

J Mao, J Huang, A Toshev… - Proceedings of the …, 2016 - openaccess.thecvf.com
We propose a method that can generate an unambiguous description (known as a referring
expression) of a specific object or region in an image, and which can also comprehend or …

Intention oriented image captions with guiding objects

Y Zheng, Y Li, S Wang - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
Although existing image caption models can produce promising results using recurrent
neural networks (RNNs), it is difficult to guarantee that an object we care about is contained …

Good news, everyone! Context driven entity-aware captioning for news images

AF Biten, L Gomez, M Rusinol… - Proceedings of the …, 2019 - openaccess.thecvf.com
Current image captioning systems perform at a merely descriptive level, essentially
enumerating the objects in the scene and their relations. Humans, on the contrary, interpret …

Describing like humans: on diversity in image captioning

Q Wang, AB Chan - … of the IEEE/CVF Conference on …, 2019 - openaccess.thecvf.com
Recently, the state-of-the-art models for image captioning have overtaken human
performance based on the most popular metrics, such as BLEU, METEOR, ROUGE and …

Deep compositional captioning: Describing novel object categories without paired training data

LA Hendricks, S Venugopalan… - Proceedings of the …, 2016 - openaccess.thecvf.com
While recent deep neural network models have achieved promising results on the image
captioning task, they rely largely on the availability of corpora with paired image and …

Pointing novel objects in image captioning

Y Li, T Yao, Y Pan, H Chao… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Image captioning has received significant attention with remarkable improvements in recent
advances. Nevertheless, images in the wild encapsulate rich knowledge and cannot be …

Beyond a pre-trained object detector: Cross-modal textual and visual context for image captioning

CW Kuo, Z Kira - Proceedings of the IEEE/CVF conference …, 2022 - openaccess.thecvf.com
Significant progress has been made on visual captioning, largely relying on pre-trained
features and later fixed object detectors that serve as rich inputs to auto-regressive models …

Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge

O Vinyals, A Toshev, S Bengio… - IEEE transactions on …, 2016 - ieeexplore.ieee.org
Automatically describing the content of an image is a fundamental problem in artificial
intelligence that connects computer vision and natural language processing. In this paper …

Unsupervised image captioning

Y Feng, L Ma, W Liu, J Luo - Proceedings of the IEEE/CVF …, 2019 - openaccess.thecvf.com
Deep neural networks have achieved great successes on the image captioning task.
However, most of the existing models depend heavily on paired image-sentence datasets …

Rich image captioning in the wild

K Tran, X He, L Zhang, J Sun, C Carapcea… - Proceedings of the …, 2016 - cv-foundation.org
We present an image caption system that addresses new challenges of automatically
describing images in the wild. The challenges include generating high quality caption with …