We present RemixIT, a simple yet effective self-supervised method for training speech enhancement without the need of a single isolated in-domain speech nor a noise waveform …
G Fabbro, S Uhlich, CH Lai… - Trans. Int. Soc …, 2024 - account.transactions.ismir.net
This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge (SDX'23). We provide a summary of the challenge setup and introduce the task of robust …
C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of music. Addressing the needs of real-world applications, the study of technologies related to …
FQ Li, SL Wang, Y Zhu - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
The wide application of deep learning techniques is boosting the regulation of deep learning models, especially deep neural networks (DNN), as commercial products. A necessary …
We propose RemixIT, a simple and novel self-supervised training method for speech enhancement. The proposed method is based on a continuously self-training scheme that …
Singing voice separation aims to separate music into vocals and accompaniment components. One of the major constraints for the task is the limited amount of training data …
Z Zhu, Y Sato - 2021 IEEE Automatic Speech Recognition and …, 2021 - ieeexplore.ieee.org
The collection of large amounts of labeled data for speech emotion recognition requires considerable time and effort. As a result, the sizes of existing corpora are limited. One …
In this work, we propose Exformer, a time-domain architecture for target speaker extraction. It consists of a pre-trained speaker embedder network and a separator network based on …
W Yuan, S Wang, J Wang, M Unoki… - IEEE/ACM transactions …, 2023 - ieeexplore.ieee.org
Learning effective vocal representations from a waveform mixture is a crucial but challenging task for deep neural network (DNN)-based singing voice separation (SVS) …