retrieval augmentation image captioning - Academic Resource Search

SmallCap: lightweight image captioning prompted with retrieval augmentation

R Ramos, B Martins, D Elliott… - Proceedings of the …, 2023 - openaccess.thecvf.com

… SMALLCAP, an image captioning model, prompted with captions retrieved from an external
datastore of text, based on the input image. This formulation of image captioning enables a …

Cited by 59 · Related articles · All 5 versions

[PDF] acm.org

Retrieval-augmented transformer for image captioning

S Sarto, M Cornia, L Baraldi, R Cucchiara - Proceedings of the 19th …, 2022 - dl.acm.org

… In this paper, we investigate the development of a retrieval component for image captioning.
… -augmented attention layer to predict tokens based on the past context and on text retrieved …

Cited by 37 · Related articles · All 5 versions

[PDF] arxiv.org

Retrieval-augmented image captioning

R Ramos, D Elliott, B Martins - arXiv preprint arXiv:2302.08268, 2023 - arxiv.org

… Inspired by retrieval-augmented language generation and pretrained Vision and Language
(… to image captioning that generates sentences given the input image and a set of captions re…

Cited by 20 · Related articles · All 3 versions

[PDF] aaai.org

Memory-augmented image captioning

Z Fei - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org

… retrieval data set, which for image captioning can be up to billions of examples. To search
over this large memory bank rapidly, we adopt FAISS (Johnson, Douze, and Jegou 2019), an …

Cited by 30 · Related articles · All 3 versions

[PDF] arxiv.org

Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning

W Li, J Li, R Ramos, R Tang, D Elliott - arXiv preprint arXiv:2406.02265, 2024 - arxiv.org

… We evaluate the robustness of a single retrieval-augmented image captioning model in
this study. Given variations in training process and model structures, the observed model …

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

J Li, DM Vo, A Sugimoto… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

… Large language models (LLMs)-based image captioning has the capability of describing …
retrieval-augmented image captioning method that prompts LLMs with object names retrieved …

Cited by 7 · Related articles · All 3 versions

[HTML] mdpi.com

[HTML] Text augmentation using BERT for image captioning

V Atliha, D Šešok - Applied Sciences, 2020 - mdpi.com

… visually impaired people improve a human-computer interactions by introducing visual
concepts to a computer and create better features for information retrieval using images [2,3]. …

Cited by 35 · Related articles · All 7 versions

[PDF] arxiv.org

Re-ViLM: Retrieval-augmented visual language model for zero and few-shot image captioning

Z Yang, W Ping, Z Liu, V Korthikanti, W Nie… - arXiv preprint arXiv …, 2023 - arxiv.org

… image captioning benchmarks. We aim to demonstrate the superiority of our retrieval
augmentation … and relevance of generated captions through retrieving relevant knowledge from …

Cited by 16 · Related articles · All 7 versions

Retrieval-enhanced adversarial training with dynamic memory-augmented attention for image paragraph captioning

C Xu, M Yang, X Ao, Y Shen, R Xu, J Tian - Knowledge-Based Systems, 2021 - Elsevier

… Concretely, RAMP treats the retrieved captions as reference captions to augment the … the
image captioning model (generator) to incorporate informative content in retrieved captions into …

Cited by 15 · Related articles · All 2 versions

[PDF] thecvf.com

Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage

D Cioni, L Berlincioni, F Becattini… - Proceedings of the …, 2023 - openaccess.thecvf.com

… In particular, we explore the benefits of augmenting artwork datasets for image captioning.
To this end, we leverage both textual descriptions of the paintings and a diffusion model to …

Cited by 3 · Related articles · All 5 versions
