Semi-autoregressive transformer for image captioning

Y Zhou, Y Zhang, Z Hu, M Wang - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
captioning models. We evaluate SATIC model on the challenging MSCOCO [3] image
captioning … We present three examples of generated image captions in Figure 3. From the top …

Image Explorer: Multi-layered touch exploration to make images accessible

J Lee, YH Peng, J Herskovitz, A Guo - Proceedings of the 23rd …, 2021 - dl.acm.org
image alt-text, they may be of little use as user-written alt-text is … auto-generated image
captions, we present Image Explorer, a touch-based image exploration system that divides image

Beyond a pre-trained object detector: Cross-modal textual and visual context for image captioning

CW Kuo, Z Kira - Proceedings of the IEEE/CVF conference …, 2022 - openaccess.thecvf.com
… richness to allow the captioning model to properly ground … image, and show qualitatively
and quantitatively that this can improve grounding. We validate our method on image captioning

Turkish image captioning with vision transformer based encoders and text decoders

S Yıldız, A Memiş, S Varlı - 2024 32nd Signal Processing and …, 2024 - ieeexplore.ieee.org
image captions are generated via a text decoder block. To test the performance of the Turkish
image captioning … a benchmark dataset consisting of Turkish image captions, was used. In …

Multi-GRU based automated image captioning for smartphones

R Keskin, ÖT Moral, V Kılıç… - 2021 29th Signal …, 2021 - ieeexplore.ieee.org
… Abstract—Image captioning is the description of an image with natural language … of many
image captioning applications. In this study, a novel automatic image captioning system based …

FuseCap: Leveraging large language models for enriched fused image captions

N Rotstein, D Bensaïd, S Brody… - Proceedings of the …, 2024 - openaccess.thecvf.com
… models for image captioning. However, these models frequently produce generic captions
and may … , we also assess the captions through image-to-text and text-to-image retrieval tasks. …

Convolutional image captioning

J Aneja, A Deshpande… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
… Secondly, as we will show in our results for image captioning, RNNs tend to produce
lower … Next, we describe an alternative convolutional approach to image captioning which …

A review of transformer-based approaches for image captioning

O Ondeng, H Ouma, P Akuon - Applied Sciences, 2023 - mdpi.com
… In this paper, we review a number of transformer-based image captioning models leading
up to the current state-of-the-art. Other reviews of image captioning models have been …

Iconographic image captioning for artworks

E Cetinic - … ICPR International Workshops and Challenges: Virtual …, 2021 - Springer
… not structured primarily as an image captioning dataset, each … on the down-stream task of
image captioning [42]. Transformer-… of artwork images with the goal to generate image captions

Entity-aware multimodal alignment framework for news image captioning

J Zhang, H Zhang, X Wan - arXiv preprint arXiv:2402.19404, 2024 - arxiv.org
… Therefore, we attempt to create a specialized image-text matching task within the news
image captioning task to align vision features with entity-aware textual features. Furthermore, …