Deep learning (DL), as one of the most active machine learning research fields, has achieved great success in numerous scientific and technological disciplines, including …
PW Koh, S Sagawa, H Marklund… - International …, 2021 - proceedings.mlr.press
Distribution shifts—where the training distribution differs from the test distribution—can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to benchmark the performance of Self-Supervised Learning (SSL) models on various speech …
K Park, T Mulc - arXiv preprint arXiv:1903.11269, 2019 - arxiv.org
We describe our development of CSS10, a collection of single speaker speech datasets for ten languages. It is composed of short audio clips from LibriVox audiobooks and their …
The growth in online child exploitation material is a significant challenge for European Law Enforcement Agencies (LEAs). One of the most important sources of such online information …
The goal of this work is to train strong models for visual speech recognition without requiring human annotated ground truth data. We achieve this by distilling from an Automatic Speech …
Collecting large amounts of data to train end-to-end Speech Translation (ST) models is more difficult compared to the ASR and MT tasks. Previous studies have proposed the use of …
B Milde, A Köhn - Speech communication; 13th ITG …, 2018 - ieeexplore.ieee.org
High quality Automatic Speech Recognition (ASR) is a prerequisite for speech-based applications and research. While state-of-the-art ASR software is freely available, the …
J Millet, JR King - arXiv preprint arXiv:2103.01032, 2021 - arxiv.org
Our ability to comprehend speech remains, to date, unrivaled by deep learning models. This feat could result from the brain's ability to fine-tune generic sound representations for speech …