Simple and controllable music generation

J Copet, F Kreuk, I Gat, T Remez… - Advances in …, 2024 - proceedings.neurips.cc
We tackle the task of conditional music generation. We introduce MusicGen, a single
Language Model (LM) that operates over several streams of compressed discrete music …

High fidelity neural audio compression

A Défossez, J Copet, G Synnaeve, Y Adi - arXiv preprint arXiv:2210.13438, 2022 - arxiv.org
We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural
networks. It consists in a streaming encoder-decoder architecture with quantized latent …

Real time speech enhancement in the waveform domain

A Défossez, G Synnaeve, Y Adi - arXiv preprint arXiv:2006.12847, 2020 - arxiv.org
We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

Spleeter: a fast and efficient music source separation tool with pre-trained models

R Hennequin, A Khlif, F Voituret… - Journal of Open Source …, 2020 - joss.theoj.org
We present and release a new tool for music source separation with pre-trained models
called Spleeter. Spleeter was designed with ease of use, separation performance, and …

Hybrid spectrogram and waveform source separation

A Défossez - arXiv preprint arXiv:2111.03600, 2021 - arxiv.org
Source separation models either work on the spectrogram or waveform domain. In this work,
we show how to perform end-to-end hybrid source separation, letting the model decide …

Hybrid transformers for music source separation

S Rouard, F Massa, A Défossez - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
A natural question arising in Music Source Separation (MSS) is whether long range
contextual information is useful, or whether local acoustic features are sufficient. In other …

Music source separation with band-split RNN

Y Luo, J Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org
The performance of music source separation (MSS) models has been greatly improved in
recent years thanks to the development of novel neural network architectures and training …

Cold diffusion for speech enhancement

H Yen, FG Germain, G Wichern… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Diffusion models have recently shown promising results for difficult enhancement tasks such
as the conditional and unconditional restoration of natural images and audio signals. In this …

Voice separation with an unknown number of multiple speakers

E Nachmani, Y Adi, L Wolf - International Conference on …, 2020 - proceedings.mlr.press
We present a new method for separating a mixed audio sequence, in which multiple voices
speak simultaneously. The new method employs gated neural networks that are trained to …

Sudo rm -rf: Efficient networks for universal audio source separation

E Tzinis, Z Wang, P Smaragdis - 2020 IEEE 30th International …, 2020 - ieeexplore.ieee.org
In this paper, we present an efficient neural network for end-to-end general purpose audio
source separation. Specifically, the backbone structure of this convolutional network is the …