Image captioning: Transforming objects into words

S Herdade, A Kappeler, K Boakye… - Advances in neural …, 2019 - proceedings.neurips.cc
… the use of object spatial relationship modeling for image captioning, specifically within the
… For image captioning, our architecture uses the feature vectors from the object detector as …

Image captioning with object detection and localization

Z Yang, YJ Zhang, S Rehman, Y Huang - Image and Graphics: 9th …, 2017 - Springer
… of objects in an image. Due to the aforementioned reason, we combine object detection with
image captioning to … Initially, we use an object detection model to detect objects in the image

Beyond a pre-trained object detector: Cross-modal textual and visual context for image captioning

CW Kuo, Z Kira - Proceedings of the IEEE/CVF conference …, 2022 - openaccess.thecvf.com
… In this paper, we validate our proposed method on the VL task of image captioning. By …
-trained object detector described above, our method improves one of the SoTA image captioning

Image captioning using motion-CNN with object detection

K Iwamura, JY Louhi Kasahara, A Moro, A Yamashita… - Sensors, 2021 - mdpi.com
object region, not all the motion features. Therefore, object detection that detects an object
region in an image … our proposed method improves image captioning performance. Previous …

Image captioning with unseen objects

B Demirel, RG Cinbis, N Ikizler-Cinbis - arXiv preprint arXiv:1908.00047, 2019 - arxiv.org
… purely as a language modeling problem and presume the availability of a pre-trained fully-supervised
object detector over all object classes of interest, to the best of our knowledge. …

Image captioning model using attention and object features to mimic human image understanding

MA Al-Malla, A Jafar, N Ghneim - Journal of Big Data, 2022 - Springer
… In order to take advantage of the image classification features and the object detection
features, we add this concatenation step, where we attach the output of the YOLOv4 subsystem as …

CapSal: Leveraging captioning to boost semantics for salient object detection

L Zhang, J Zhang, Z Lin, H Lu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
… propose CapSal, a salient object detection framework that exploits image captioning to pro-
… Inspired by the success of these works, we propose to leverage image captioning as an …

nocaps: Novel object captioning at scale

H Agrawal, K Desai, Y Wang, X Chen… - Proceedings of the …, 2019 - openaccess.thecvf.com
… Dataset Analysis In this section, we compare our nocaps benchmark to COCO Captions [5]
in terms of both image content and caption diversity. Based on ground-truth object detection

Image captioning through image transformer

S He, W Liao, HR Tavakoli, M Yang… - Proceedings of the …, 2020 - openaccess.thecvf.com
… Two stage attention models consists of bottom-up attention and top-down attention, where
bottom-up attention first uses object detection models to detect multiple informative regions in …

Open-vocabulary object detection using captions

A Zareian, KD Rosa, DH Hu… - Proceedings of the …, 2021 - openaccess.thecvf.com
object detectors using bounding box annotations for a limited set of object categories, as well
as image… problem of knowledge transfer from image-caption pretraining to object detection. …