Related articles - Academic resource search

VisText: A benchmark for semantically rich chart captioning

BJ Tang, A Boggust, A Satyanarayan - arXiv preprint arXiv:2307.05356, 2023 - arxiv.org

Captions that describe or explain charts help improve recall and comprehension of the
depicted data and provide a more accessible medium for people with visual disabilities …

Cited by 30 · Related articles · All 10 versions

[PDF] joelchan.me

Generating accurate caption units for figure captioning

X Qian, E Koh, F Du, S Kim, J Chan, RA Rossi… - Proceedings of the Web …, 2021 - dl.acm.org

Scientific-style figures are commonly used on the web to present numerical information.
Captions that tell accurate figure information and sound natural would significantly improve …

Cited by 33 · Related articles · All 6 versions

[PDF] arxiv.org

Do LVLMs understand charts? Analyzing and correcting factual errors in chart captioning

KH Huang, M Zhou, HP Chan, YR Fung… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent advancements in large vision-language models (LVLMs) have led to significant
progress in generating natural language descriptions for visual content and thus enhancing …

Cited by 12 · Related articles · All 2 versions

[PDF] arxiv.org

ClipCap: CLIP prefix for image captioning

R Mokady, A Hertz, AH Bermano - arXiv preprint arXiv:2111.09734, 2021 - arxiv.org

Image captioning is a fundamental task in vision-language understanding, where the model
predicts a textual informative caption to a given input image. In this paper, we present a …

Cited by 573 · Related articles · All 2 versions

[PDF] thecvf.com

A picture is worth more than 77 text tokens: Evaluating CLIP-style models on dense captions

J Urbanek, F Bordes, P Astolfi… - Proceedings of the …, 2024 - openaccess.thecvf.com

Curation methods for massive vision-language datasets trade off between dataset size and
quality. However even the highest quality of available curated captions are far too short to …

Cited by 4 · Related articles · All 3 versions

[PDF] arxiv.org

Chart-to-text: Generating natural language descriptions for charts by adapting the transformer model

J Obeid, E Hoque - arXiv preprint arXiv:2010.09142, 2020 - arxiv.org

Information visualizations such as bar charts and line charts are very popular for exploring
data and communicating insights. Interpreting and making sense of such visualizations can …

Cited by 97 · Related articles · All 4 versions

[PDF] aclanthology.org

AudioCaps: Generating captions for audios in the wild

CD Kim, B Kim, H Lee, G Kim - … of the 2019 Conference of the …, 2019 - aclanthology.org

We explore the problem of Audio Captioning: generating natural language description for
any kind of audio in the wild, which has been surprisingly unexplored in previous research …

Cited by 380 · Related articles

[PDF] thecvf.com

Guiding image captioning models toward more specific captions

S Kornblith, L Li, Z Wang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Image captioning is conventionally formulated as the task of generating captions that match
the conditional distribution of reference image-caption pairs. However, reference captions in …

Cited by 4 · Related articles · All 5 versions

[PDF] arxiv.org

Chart-to-text: A large-scale benchmark for chart summarization

S Kantharaj, RTK Leong, X Lin, A Masry… - arXiv preprint arXiv …, 2022 - arxiv.org

Charts are commonly used for exploring data and communicating insights. Generating
natural language summaries from charts can be very helpful for people in inferring key …

Cited by 78 · Related articles · All 5 versions

[PDF] arxiv.org

CLIPScore: A reference-free evaluation metric for image captioning

J Hessel, A Holtzman, M Forbes, RL Bras… - arXiv preprint arXiv …, 2021 - arxiv.org

Image captioning has conventionally relied on reference-based automatic evaluations,
where machine captions are compared against captions written by humans. This is in …

Cited by 728 · Related articles · All 5 versions
