Q Fang, R Ye, L Li, Y Feng, M Wang - arXiv preprint arXiv:2203.10426, 2022 - arxiv.org
How to learn a better speech representation for end-to-end speech-to-text translation (ST) with limited labeled data? Existing techniques often attempt to transfer powerful machine …
C Han, M Wang, H Ji, L Li - arXiv preprint arXiv:2105.03095, 2021 - arxiv.org
Having numerous potential applications and great impact, end-to-end speech translation (ST) has long been treated as an independent task, failing to fully draw strength from the …
M Sperber, M Paulik - arXiv preprint arXiv:2004.06358, 2020 - arxiv.org
Over its three-decade history, speech translation has experienced several shifts in its primary research themes, moving from loosely coupled cascades of speech recognition and …
We introduce a novel sequence-to-sequence (seq2seq) voice conversion (VC) model based on the Transformer architecture with text-to-speech (TTS) pretraining. Seq2seq VC models …
X Ma, J Pino, P Koehn - arXiv preprint arXiv:2011.02048, 2020 - arxiv.org
Simultaneous text translation and end-to-end speech translation have recently made great progress but little work has combined these tasks together. We investigate how to adapt …
M Orken, O Dina, A Keylan, T Tolganay, O Mohamed - Scientific reports, 2022 - nature.com
Today, the Transformer model, which allows parallelization and also has its own internal attention, has been widely used in the field of speech recognition. The great advantage of …
Y Zhou, Q Fang, Y Feng - arXiv preprint arXiv:2305.14635, 2023 - arxiv.org
End-to-end speech translation (ST) is the task of translating speech signals in the source language into text in the target language. As a cross-modal task, end-to-end ST is difficult to …
Q Dong, R Ye, M Wang, H Zhou, S Xu, B Xu… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
An end-to-end speech-to-text translation (ST) system takes audio in a source language and outputs the text in a target language. Existing methods are limited by the amount of parallel corpus …
Recent advances in deep learning show that the end-to-end speech-to-text translation model is a promising approach to direct the speech translation field. In this work, we provide an …