From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e. describing images …

Vision-language intelligence: Tasks, representation learning, and large models

F Li, H Zhang, YF Zhang, S Liu, J Guo, LM Ni… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper presents a comprehensive survey of vision-language (VL) intelligence from the
perspective of time. This survey is inspired by the remarkable progress in both computer …

A comprehensive literature review on image captioning methods and metrics based on deep learning technique

AS Al-Shamayleh, O Adwan, MA Alsharaiah… - Multimedia Tools and …, 2024 - Springer
One of the trending areas of study in artificial intelligence is image captioning. Image
captioning is a process of creating descriptive information for visual objects, image …

Universal captioner: Long-tail vision-and-language model training through content-style separation

M Cornia, L Baraldi, G Fiameni… - arXiv preprint arXiv …, 2021 - researchgate.net
While captioning models have obtained compelling results in describing natural images,
they still do not cover the entire long-tail distribution of real-world concepts. In this paper, we …

Image understanding by captioning with differentiable architecture search

R Hosseini, P Xie - Proceedings of the 30th ACM international …, 2022 - dl.acm.org
In deep learning applications, image understanding is a crucial task, where several
techniques such as image captioning and visual question answering have been widely …

Image captioning via proximal policy optimization

L Zhang, Y Zhang, X Zhao, Z Zou - Image and Vision Computing, 2021 - Elsevier
Image captioning is the task of generating captions of images in natural language. The
training typically consists of two phases, first minimizing the XE (cross-entropy) loss, and …

Image captioning using deep learning

CS Kanimozhiselvi, V Karthika… - 2022 International …, 2022 - ieeexplore.ieee.org
The process of generating a textual description for images is known as image captioning.
Nowadays it is one of the recent and growing research problems. Day by day various …

Knowledge Acquisition for Human-In-The-Loop Image Captioning

E Zheng, Q Yu, R Li, P Shi… - … Conference on Artificial …, 2023 - proceedings.mlr.press
Image captioning offers a computational process to understand the semantics of images and
convey them using descriptive language. However, automated captioning models may not …

ReverseGAN: An intelligent reverse generative adversarial networks system for complex image captioning generation

G Tong, W Shao, Y Li - Displays, 2024 - Elsevier
Towards the inclusion of complex semantic relational images, we propose an intelligent
Reverse Generative Adversarial Network (ReverseGAN) with generative task guidance to …

EAES: Effective augmented embedding spaces for text-based image captioning

K Nguyen, DC Bui, T Trinh, ND Vo - IEEE Access, 2022 - ieeexplore.ieee.org
Text-based Image Captioning has been a novel problem since 2020. This topic remains
challenging because it requires the model to comprehend not only the visual context but …