Automatic alt-text: Computer-generated image descriptions for blind users on a social network...

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

Cited by 291 · Related articles · All 11 versions

[PDF] eg.org

Accessible visualization: Design space, opportunities, and challenges

NW Kim, SC Joyner, A Riegelhuth… - Computer graphics …, 2021 - Wiley Online Library

Visualizations are now widely used across disciplines to understand and communicate data.
The benefit of visualizations lies in leveraging our natural visual perception. However, the …

Cited by 94 · Related articles · All 5 versions

[PDF] arxiv.org

AI-generated content (AIGC): A survey

J Wu, W Gan, Z Chen, S Wan, H Lin - arXiv preprint arXiv:2304.06632, 2023 - arxiv.org

To address the challenges of digital intelligence in the digital economy, artificial intelligence-
generated content (AIGC) has emerged. AIGC uses artificial intelligence to assist or replace …

Cited by 104 · Related articles · All 3 versions

[PDF] thecvf.com

VizWiz grand challenge: Answering visual questions from blind people

D Gurari, Q Li, AJ Stangl, A Guo, C Lin… - Proceedings of the …, 2018 - openaccess.thecvf.com

The study of algorithms to automatically answer visual questions currently is motivated by
visual question answering (VQA) datasets constructed in artificial VQA settings. We propose …

Cited by 610 · Related articles · All 14 versions

Region-aware image captioning via interaction learning

AA Liu, Y Zhai, N Xu, W Nie, W Li… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org

Image captioning is one of the primary goals in computer vision which aims to automatically
generate natural descriptions for images. Intuitively, human visual system can notice some …

Cited by 105 · Related articles

[PDF] thecvf.com

Improved image captioning via policy gradient optimization of SPIDEr

S Liu, Z Zhu, N Ye, S Guadarrama… - Proceedings of the …, 2017 - openaccess.thecvf.com

Current image captioning methods are usually trained via maximum likelihood estimation.
However, the log-likelihood score of a caption does not correlate well with human …

Cited by 500 · Related articles · All 6 versions

[PDF] arxiv.org

Connecting vision and language with localized narratives

J Pont-Tuset, J Uijlings, S Changpinyo… - Computer Vision–ECCV …, 2020 - Springer

Abstract We propose Localized Narratives, a new form of multimodal image annotations
connecting vision and language. We ask annotators to describe an image with their voice …

Cited by 212 · Related articles · All 7 versions

[PDF] arxiv.org

Captioning images taken by people who are blind

D Gurari, Y Zhao, M Zhang, N Bhattacharya - Computer Vision–ECCV …, 2020 - Springer

While an important problem in the vision community is to design algorithms that can
automatically caption images, few publicly-available datasets for algorithm development …

Cited by 187 · Related articles · All 7 versions

[PDF] acm.org

Understanding the effect of out-of-distribution examples and interactive explanations on human-AI decision making

H Liu, V Lai, C Tan - Proceedings of the ACM on Human-Computer …, 2021 - dl.acm.org

Although AI holds promise for improving human decision making in societally critical
domains, it remains an open question how human-AI teams can reliably outperform AI alone …

Cited by 109 · Related articles · All 5 versions

[PDF] aaai.org

Taxonomizing and measuring representational harms: A look at image tagging

J Katzman, A Wang, M Scheuerman… - Proceedings of the …, 2023 - ojs.aaai.org

In this paper, we examine computational approaches for measuring the "fairness" of image
tagging systems, finding that they cluster into five distinct categories, each with its own …

Cited by 26 · Related articles · All 5 versions
