A visual attention grounding neural model for multimodal machine translation

S Uppal, S Bhagat, D Hazarika, N Majumder, S Poria… - Information …, 2022 - Elsevier

Deep Learning and its applications have cascaded impactful research and development
with a diverse range of modalities present in the real-world data. More recently, this has …

Cited by 96 - Related articles - All 5 versions

[PDF] ieee.org

A systematic literature review on multimodal machine learning: Applications, challenges, gaps and future directions

A Barua, MU Ahmed, S Begum - IEEE Access, 2023 - ieeexplore.ieee.org

Multimodal machine learning (MML) is a tempting multidisciplinary research area where
heterogeneous data from multiple modalities and machine learning (ML) are combined to …

Cited by 31 - Related articles - All 4 versions

[PDF] acm.org

WIT: Wikipedia-based image text dataset for multimodal multilingual machine learning

K Srinivasan, K Raman, J Chen, M Bendersky… - Proceedings of the 44th …, 2021 - dl.acm.org

The milestone improvements brought about by deep representation learning and pre-training
techniques have led to large performance gains across downstream NLP, IR and …

Cited by 262 - Related articles - All 8 versions

[PDF] arxiv.org

A novel graph-based multi-modal fusion encoder for neural machine translation

Y Yin, F Meng, J Su, C Zhou, Z Yang, J Zhou… - arXiv preprint arXiv …, 2020 - arxiv.org

Multi-modal neural machine translation (NMT) aims to translate source sentences into a
target language paired with images. However, dominant multi-modal NMT models do not …

Cited by 146 - Related articles - All 5 versions

[PDF] jair.org

Trends in integration of vision and language research: A survey of tasks, datasets, and methods

A Mogadala, M Kalimuthu, D Klakow - Journal of Artificial Intelligence …, 2021 - jair.org

Interest in Artificial Intelligence (AI) and its applications has seen unprecedented
growth in the last few years. This success can be partly attributed to the advancements made …

Cited by 147 - Related articles - All 8 versions

[PDF] thecvf.com

UC2: Universal cross-lingual cross-modal vision-and-language pre-training

M Zhou, L Zhou, S Wang, Y Cheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Vision-and-language pre-training has achieved impressive success in learning multimodal
representations between vision and language. To generalize this success to non-English …

Cited by 81 - Related articles - All 9 versions

[PDF] jair.org

Neural natural language generation: A survey on multilinguality, multimodality, controllability and learning

E Erdem, M Kuyu, S Yagcioglu, A Frank… - Journal of Artificial …, 2022 - jair.org

Developing artificial learning systems that can understand and generate natural language
has been one of the long-standing goals of artificial intelligence. Recent decades have …

Cited by 48 - Related articles - All 20 versions

[PDF] arxiv.org

Dynamic context-guided capsule network for multimodal machine translation

H Lin, F Meng, J Su, Y Yin, Z Yang, Y Ge… - Proceedings of the 28th …, 2020 - dl.acm.org

Multimodal machine translation (MMT), which mainly focuses on enhancing text-only
translation with visual features, has attracted considerable attention from both computer …

Cited by 80 - Related articles - All 4 versions

[PDF] arxiv.org

Grounding 'grounding' in NLP

KR Chandu, Y Bisk, AW Black - arXiv preprint arXiv:2106.02192, 2021 - arxiv.org

The NLP community has seen substantial recent interest in grounding to facilitate interaction
between language technologies and the world. However, as a community, we use the term …

Cited by 58 - Related articles - All 6 versions

Multimodality information fusion for automated machine translation

L Li, T Tayir, Y Han, X Tao, JD Velásquez - Information Fusion, 2023 - Elsevier

Machine translation is a popular automation approach for translating texts between
different languages. Although traditionally it has a strong focus on natural language, images …

Cited by 19 - Related articles - All 3 versions
