alt text automatic image captioning - Academic Resource Search

Multi-modal image captioning for the visually impaired

H Ahsan, N Bhalla, D Bhatt, K Shah - arXiv preprint arXiv:2105.08106, 2021 - arxiv.org

… an image captioning model for the blind that specifically leverages text detected in the image.
2… -generator mechanism when generating captions to copy the detected text when needed. …

Cited by 38 · Related articles · All 5 versions

[PDF] arxiv.org

Deep learning approaches on image captioning: A review

T Ghandi, H Pourreza, H Mahyar - ACM Computing Surveys, 2023 - dl.acm.org

… image" captioning methods. In this paper, we discuss various methods of image captioning
… most common problems and challenges of image captioning. We provide a comprehensive …

Cited by 46 · Related articles · All 5 versions

[PDF] arxiv.org

Informative image captioning with external sources of information

S Zhao, P Sharma, T Levinboim, R Soricut - arXiv preprint arXiv …, 2019 - arxiv.org

… We present an image captioning model that combines image features with fine-grained
entities and object labels, and learns to produce fluent and informative image captions. …

Cited by 42 · Related articles · All 5 versions

[PDF] arxiv.org

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

… in image captioning has not reached a conclusive answer yet. This work aims at providing
a comprehensive overview of image captioning approaches, from visual encoding and text …

Cited by 298 · Related articles · All 11 versions

[PDF] thecvf.com

Noise-aware learning from web-crawled image-text data for image captioning

W Kang, J Mun, S Lee, B Roh - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

… learning and DALL·E [40] for the text-to-image generation task. This is mainly thanks to the
… described in alt-texts of web-crawled data. Inspired by this, research on image captioning is …

Cited by 9 · Related articles · All 5 versions

[PDF] arxiv.org

Re-evaluating automatic metrics for image captioning

M Kilickaya, A Erdem, N Ikizler-Cinbis… - arXiv preprint arXiv …, 2016 - arxiv.org

… In this section, we evaluate the robustness of the automatic image captioning metrics. For
this purpose, we employ the binary (two-alternative) forced choice task introduced in (Hodosh …

Cited by 201 · Related articles · All 10 versions

WATAA: Web alternative text authoring assistant for improving web content accessibility

H Jeong, M Chun, H Lee, SY Oh, H Jung - Companion proceedings of …, 2023 - dl.acm.org

… the user enters a web page URL, and the alt text checker identifies any images without
alt text. WATAA then uses an image captioning model to generate automatic alt text for each …

Cited by 8 · Related articles · All 2 versions

[PDF] thecvf.com

The unreasonable effectiveness of CLIP features for image captioning: an experimental analysis

M Barraco, M Cornia, S Cascianelli… - proceedings of the …, 2022 - openaccess.thecvf.com

… To assess the role of visual features extracted from CLIP-like models in image captioning, …
features in standard and more challenging image captioning settings. We use the commonly …

Cited by 64 · Related articles · All 6 versions

[PDF] arxiv.org

XGPT: Cross-modal generative pre-training for image captioning

Q Xia, H Huang, N Duan, D Zhang, L Ji, Z Sui… - … Processing and Chinese …, 2021 - Springer

… benchmark datasets, including COCO Captions and Flickr30k Captions. We also use XGPT
to generate image captions as data augmentation for the image retrieval task and achieve …

Cited by 71 · Related articles · All 7 versions

[PDF] arxiv.org

ClipCap: CLIP prefix for image captioning

R Mokady, A Hertz, AH Bermano - arXiv preprint arXiv:2111.09734, 2021 - arxiv.org

… auxiliary text, such as generating or editing an image … image captioning. Note that our method
does not employ the CLIP’s textual encoder, since there is no input text, and the output text …

Cited by 583 · Related articles · All 2 versions
