This research proposes a distinctive deep learning network architecture for image captioning and description generation. Specifically, we propose a hierarchically trained …
Y Zheng, Y Li, S Wang - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
Although existing image caption models can produce promising results using recurrent neural networks (RNNs), it is difficult to guarantee that an object we care about is contained …
L Yang, H Wang, P Tang, Q Li - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Image captioning is a challenging visual understanding task that has drawn increasing attention from researchers. In general, two inputs are required at each time step by the Long …
Q Wang, AB Chan - … of the IEEE/CVF Conference on …, 2019 - openaccess.thecvf.com
Recently, the state-of-the-art models for image captioning have overtaken human performance based on the most popular metrics, such as BLEU, METEOR, ROUGE and …
Automatically generating a natural language description of an image has recently attracted interest both because of its importance in practical applications and because it connects two …
A Karpathy, L Fei-Fei - Proceedings of the IEEE conference on …, 2015 - cv-foundation.org
We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to …
Y Mao, L Chen, Z Jiang, D Zhang, Z Zhang… - Proceedings of the 30th …, 2022 - dl.acm.org
Distinctive Image Captioning (DIC), generating distinctive captions that describe the unique details of a target image, has received considerable attention over the last few years. A …
Describing the content of an image is a challenging task. To enable detailed description, it requires the detection and recognition of objects, people, relationships and associated …
R Luo, B Price, S Cohen… - Proceedings of the …, 2018 - openaccess.thecvf.com
One property that remains lacking in image captions generated by contemporary methods is discriminability: being able to tell two images apart given the caption for one of them. We …