Fast image caption generation with position alignment

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

Cited by 317 · Related articles · All 11 versions

Deep image captioning: A review of methods, trends and future challenges

L Xu, Q Tang, J Lv, B Zheng, X Zeng, W Li - Neurocomputing, 2023 - Elsevier

Image captioning, also called report generation in medical field, aims to describe visual
content of images in human language, which requires to model semantic relationship …

Cited by 23 · Related articles · All 2 versions

[PDF] arxiv.org

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Non-autoregressive (NAR) generation, which is first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

Cited by 78 · Related articles · All 8 versions

[PDF] thecvf.com

DeeCap: Dynamic early exiting for efficient image captioning

Z Fei, X Yan, S Wang, Q Tian - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Both accuracy and efficiency are crucial for image captioning in real-world scenarios.
Although Transformer-based models have gained significant improved captioning …

Cited by 38 · Related articles · All 4 versions

[PDF] aaai.org

Attention-aligned transformer for image captioning

Z Fei - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Recently, attention-based image captioning models, which are expected to ground correct
image regions for proper word generations, have achieved remarkable performance …

Cited by 35 · Related articles · All 4 versions

[PDF] thecvf.com

Semi-autoregressive transformer for image captioning

Y Zhou, Y Zhang, Z Hu, M Wang - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Current state-of-the-art image captioning models adopt autoregressive decoders, i.e., they
generate each word by conditioning on previously generated words, which leads to heavy …

Cited by 35 · Related articles · All 7 versions

[PDF] arxiv.org

Non-autoregressive image captioning with counterfactuals-critical multi-agent learning

L Guo, J Liu, X Zhu, X He, J Jiang, H Lu - arXiv preprint arXiv:2005.04690, 2020 - arxiv.org

Most image captioning models are autoregressive, i.e., they generate each word by
conditioning on previously generated words, which leads to heavy latency during inference …

Cited by 53 · Related articles · All 5 versions

[PDF] aaai.org

Uncertainty-aware image captioning

Z Fei, M Fan, L Zhu, J Huang, X Wei… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

It is well believed that the higher uncertainty in a word of the caption, the more inter-
correlated context information is required to determine it. However, current image captioning …

Cited by 10 · Related articles · All 4 versions

[PDF] arxiv.org

Diffusion-RWKV: Scaling RWKV-like architectures for diffusion models

Z Fei, M Fan, C Yu, D Li, J Huang - arXiv preprint arXiv:2404.04478, 2024 - arxiv.org

Transformers have catalyzed advancements in computer vision and natural language
processing (NLP) fields. However, substantial computational complexity poses limitations for …

Cited by 7 · Related articles · All 2 versions

[PDF] aaai.org

Memory-augmented image captioning

Z Fei - Proceedings of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org

Current deep learning-based image captioning systems have been proven to store practical
knowledge with their parameters and achieve competitive performances in the public …

Cited by 30 · Related articles · All 3 versions
