Generative large language models (LLMs), e.g., ChatGPT, have demonstrated remarkable proficiency across several NLP tasks such as machine translation, question answering, text …
Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In …
YA Mohamed, A Khanan, M Bashir… - IEEE …, 2024 - ieeexplore.ieee.org
In the context of a more linked and globalized society, the significance of proficient cross-cultural communication has been increasing to a position of utmost importance. Language …
D Rekesh, NR Koluguri, S Kriman… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Conformer-based models have become the dominant end-to-end architecture for speech processing tasks. With the objective of enhancing the conformer architecture for efficient …
Direct speech-to-speech translation (S2ST), in which all components can be optimized jointly, is advantageous over cascaded approaches to achieve fast inference with a …
We present Translatotron 2, a neural direct speech-to-speech translation model that can be trained end-to-end. Translatotron 2 consists of a speech encoder, a linguistic decoder, an …
B Zhang, B Haddow… - … conference on machine …, 2022 - proceedings.mlr.press
End-to-end (E2E) speech-to-text translation (ST) often depends on pretraining its encoder and/or decoder using source transcripts via speech recognition or text translation …
This paper reports on the shared tasks organized by the 20th IWSLT Conference. The shared tasks address 9 scientific challenges in spoken language translation: simultaneous …
The gap between speech and text modalities is a major challenge in speech-to-text translation (ST). Different methods have been proposed to reduce this gap, but most of them …