High fidelity neural audio compression

A Défossez, J Copet, G Synnaeve, Y Adi - arXiv preprint arXiv:2210.13438, 2022 - arxiv.org
We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural
networks. It consists in a streaming encoder-decoder architecture with quantized latent …

Efficient neural music generation

MWY Lam, Q Tian, T Li, Z Yin, S Feng… - Advances in …, 2024 - proceedings.neurips.cc
Recent progress in music generation has been remarkably advanced by the state-of-the-art
MusicLM, which comprises a hierarchy of three LMs, respectively, for semantic, coarse …

A review of differentiable digital signal processing for music and speech synthesis

B Hayes, J Shier, G Fazekas, A McPherson… - Frontiers in Signal …, 2024 - frontiersin.org
The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …

Multi-instrument music synthesis with spectrogram diffusion

C Hawthorne, I Simon, A Roberts, N Zeghidour… - arXiv preprint arXiv …, 2022 - arxiv.org
An ideal music synthesizer should be both interactive and expressive, generating
high-fidelity audio in realtime for arbitrary combinations of instruments and notes. Recent neural …

Automated data processing and feature engineering for deep learning and big data applications: a survey

A Mumuni, F Mumuni - Journal of Information and Intelligence, 2024 - Elsevier
Modern approach to artificial intelligence (AI) aims to design algorithms that learn directly
from data. This approach has achieved impressive results and has contributed significantly …

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

H Siuzdak - arXiv preprint arXiv:2306.00814, 2023 - arxiv.org
Recent advancements in neural vocoding are predominantly driven by Generative
Adversarial Networks (GANs) operating in the time-domain. While effective, this approach …

Music ControlNet: Multiple time-varying controls for music generation

SL Wu, C Donahue, S Watanabe… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Text-to-music generation models are now capable of generating high-quality music audio in
broad styles. However, text control is primarily suitable for the manipulation of global musical …

DDX7: Differentiable FM synthesis of musical instrument sounds

F Caspe, A McPherson, M Sandler - arXiv preprint arXiv:2208.06169, 2022 - arxiv.org
FM Synthesis is a well-known algorithm used to generate complex timbre from a compact set
of design primitives. Typically featuring a MIDI interface, it is usually impractical to control it …

Long-term rhythmic video soundtracker

J Yu, Y Wang, X Chen, X Sun… - … Conference on Machine …, 2023 - proceedings.mlr.press
We consider the problem of generating musical soundtracks in sync with rhythmic visual
cues. Most existing works rely on pre-defined music representations, leading to the …

Musika! fast infinite waveform music generation

M Pasini, J Schlüter - arXiv preprint arXiv:2208.08706, 2022 - arxiv.org
Fast and user-controllable music generation could enable novel ways of composing or
performing music. However, state-of-the-art music generation systems require large …