Learning when to translate for streaming speech

Q Dong, Y Zhu, M Wang, L Li - arXiv preprint arXiv:2109.07368, 2021 - arxiv.org
How to find proper moments to generate partial sentence translation given a streaming
speech input? Existing approaches that wait and translate for a fixed duration often break …

Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens

N San, G Paraskevopoulos, A Arora, X He… - arXiv preprint arXiv …, 2024 - arxiv.org
While massively multilingual speech models like wav2vec 2.0 XLSR-128 can be directly
fine-tuned for automatic speech recognition (ASR), downstream performance can still be …

Learning from failure: Data capture in an Australian Aboriginal community

É Le Ferrand, S Bird, L Besacier - … of the 60th Annual Meeting of …, 2022 - aclanthology.org
Most low resource language technology development is premised on the need to collect
data for training statistical models. When we follow the typical process of recording and …

Local word discovery for interactive transcription

W Lane, S Bird - Proceedings of the 2021 Conference on …, 2021 - aclanthology.org
Human expertise and the participation of speech communities are essential factors in the
success of technologies for low-resource languages. Accordingly, we propose a new …

Plug-and-Play Multilingual Few-shot Spoken Words Recognition

A Saeed, V Tsouvalas - arXiv preprint arXiv:2305.03058, 2023 - arxiv.org
As technology advances and digital devices become prevalent, seamless human-machine
communication is increasingly gaining significance. The growing adoption of mobile …

3.3 ACL 2022: deployment of the sparse transcription simulation

É Le Ferrand, S Bird, LBL From - Leveraging Speech Recognition for …, 2023 - hal.science
Most low resource language technology development is premised on the need to collect
data for training statistical models. When we follow the typical process of recording and …