M4Singer: A multi-style, multi-singer and musical score provided Mandarin singing corpus

L Zhang, R Li, S Wang, L Deng, J Liu… - Advances in …, 2022 - proceedings.neurips.cc
The lack of publicly available high-quality and accurately labeled datasets has long been a
major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present …

GibbsDDRM: A partially collapsed Gibbs sampler for solving blind inverse problems with denoising diffusion restoration

N Murata, K Saito, CH Lai, Y Takida… - International …, 2023 - proceedings.mlr.press
Pre-trained diffusion models have been successfully used as priors in a variety of linear
inverse problems, where the goal is to reconstruct a signal from noisy linear measurements …

Opencpop: A high-quality open source Chinese popular song corpus for singing voice synthesis

Y Wang, X Wang, P Zhu, J Wu, H Li, H Xue… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper introduces Opencpop, a publicly available high-quality Mandarin singing corpus
designed for singing voice synthesis (SVS). The corpus consists of 100 popular Mandarin …

The singing voice conversion challenge 2023

WC Huang, LP Violeta, S Liu, J Shi… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
We present the latest iteration of the voice conversion challenge (VCC) series, a bi-annual
scientific event aiming to compare and understand different voice conversion (VC) systems …

Globally, songs and instrumental melodies are slower and higher and use more stable pitches than speech: A Registered Report

Y Ozaki, A Tierney, PQ Pfordresher, JM McBride… - Science …, 2024 - science.org
Both music and language are found in all known human societies, yet no studies have
compared similarities and differences between song, speech, and instrumental music on a …

Unsupervised vocal dereverberation with diffusion-based generative models

K Saito, N Murata, T Uesaka, CH Lai… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Removing reverb from reverberant music is a necessary technique to clean up audio for
downstream music manipulations. Reverberation of music contains two categories, natural …

Singing voice data scaling-up: An introduction to ACE-Opencpop and KiSing-v2

J Shi, Y Lin, X Bai, K Zhang, Y Wu, Y Tang, Y Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
In singing voice synthesis (SVS), generating singing voices from musical scores faces
challenges due to limited data availability, a constraint less common in text-to-speech (TTS) …

Hierarchical diffusion models for singing voice neural vocoder

N Takahashi, M Kumar, Y Mitsufuji - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Recent progress in deep generative models has improved the quality of neural vocoders in
speech domain. However, generating a high-quality singing voice remains challenging due …

Deep learning approaches in topics of singing information processing

C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …

Globally, songs and instrumental melodies are slower, higher, and use more stable pitches than speech [Stage 2 Registered Report]

What, if any, similarities and differences between music and speech are consistent across
cultures? Both music and language are found in all known human societies and are argued …