MuST-C: a multilingual speech translation corpus

MA Di Gangi, R Cattoni, L Bentivogli, M Negri… - Proceedings of the …, 2019 - cris.fbk.eu
Current research on spoken language translation (SLT) must confront the scarcity of
sizeable and publicly available training corpora. This problem hinders the adoption of neural …

MuST-C: A multilingual corpus for end-to-end speech translation

R Cattoni, MA Di Gangi, L Bentivogli, M Negri… - Computer speech & …, 2021 - Elsevier
End-to-end spoken language translation (SLT) has recently gained popularity thanks to the
advancement of sequence-to-sequence learning in its two parent tasks: automatic speech …

Towards automatic face-to-face translation

P KR, R Mukhopadhyay, J Philip, A Jha… - Proceedings of the 27th …, 2019 - dl.acm.org
In light of the recent breakthroughs in automatic machine translation systems, we propose a
novel approach that we term "Face-to-Face Translation". As today's digital communication …

Multimodal machine translation through visuals and speech

U Sulubacak, O Caglayan, SA Grönroos, A Rouhe… - Machine …, 2020 - Springer
Multimodal machine translation involves drawing information from more than one modality,
based on the assumption that the additional modalities will contain useful alternative views …

Augmenting LibriSpeech with French translations: A multimodal corpus for direct speech translation evaluation

AC Kocabiyikoglu, L Besacier, O Kraif - arXiv preprint arXiv:1802.03142, 2018 - arxiv.org
Recent works in spoken language translation (SLT) have attempted to build end-to-end
speech-to-text translation without using source language transcription during learning or …

End-to-End Speech-to-Text Translation: A Survey

N Sethiya, CK Maurya - arXiv preprint arXiv:2312.01053, 2023 - arxiv.org
Speech-to-text translation pertains to the task of converting speech signals in one language to
text in another language. It finds its application in various domains, such as hands-free …

Large-scale streaming end-to-end speech translation with neural transducers

J Xue, P Wang, J Li, M Post, Y Gaur - arXiv preprint arXiv:2204.05352, 2022 - arxiv.org
Neural transducers have been widely used in automatic speech recognition (ASR). In this
paper, we introduce them to streaming end-to-end speech translation (ST), which aims to …

MaSS: A large and clean multilingual corpus of sentence-aligned spoken utterances extracted from the Bible

MZ Boito, WN Havard, M Garnerin, ÉL Ferrand… - arXiv preprint arXiv …, 2019 - arxiv.org
The CMU Wilderness Multilingual Speech Dataset (Black, 2019) is a newly published
multilingual speech dataset based on recorded readings of the New Testament. It provides …

BSTC: A large-scale Chinese-English speech translation dataset

R Zhang, X Wang, C Zhang, Z He, H Wu, Z Li… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper presents BSTC (Baidu Speech Translation Corpus), a large-scale Chinese-
English speech translation dataset. This dataset is constructed based on a collection of …

BLSP: Bootstrapping language-speech pre-training via behavior alignment of continuation writing

C Wang, M Liao, Z Huang, J Lu, J Wu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
The emergence of large language models (LLMs) has sparked significant interest in
extending their remarkable language capabilities to speech. However, modality alignment …