An overview of lead and accompaniment separation in music

Z Rafii, A Liutkus, FR Stöter, SI Mimilakis… - … on Audio, Speech …, 2018 - ieeexplore.ieee.org
Popular music is often composed of an accompaniment and a lead component, the latter
typically consisting of vocals. Filtering such mixtures to extract one or both components has …

FastPitch: Parallel text-to-speech with pitch prediction

A Łańcucki - ICASSP 2021-2021 IEEE International Conference …, 2021 - ieeexplore.ieee.org
We present FastPitch, a fully-parallel text-to-speech model based on FastSpeech,
conditioned on fundamental frequency contours. The model predicts pitch contours during …

[BOOK][B] Audio source separation and speech enhancement

E Vincent, T Virtanen, S Gannot - 2018 - books.google.com
Learn the technology behind hearing aids, Siri, and Echo. Audio source separation and
speech enhancement aim to extract one or more source signals of interest from an audio …

webMUSHRA—A comprehensive framework for web-based listening tests

M Schoeffler, S Bartoschek… - Journal of …, 2018 - … .openresearchsoftware.metajnl.com
For a long time, many popular listening test methods, such as ITU-R BS.1534 (MUSHRA),
could not be carried out as web-based listening tests, since established web standards did …

A comparison of discrete and soft speech units for improved voice conversion

B Van Niekerk, MA Carbonneau, J Zaïdi… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The goal of voice conversion is to transform source speech into a target voice, keeping the
content unchanged. In this paper, we focus on self-supervised representation learning for …

FFTNet: A real-time speaker-dependent neural vocoder

Z Jin, A Finkelstein, GJ Mysore… - 2018 IEEE international …, 2018 - ieeexplore.ieee.org
We introduce FFTNet, a deep learning approach synthesizing audio waveforms. Our
approach builds on the recent WaveNet project, which showed that it was possible to …

A differentiable perceptual audio metric learned from just noticeable differences

P Manocha, A Finkelstein, R Zhang, NJ Bryan… - arXiv preprint arXiv …, 2020 - arxiv.org
Many audio processing tasks require perceptual assessment. The "gold standard" of
obtaining human judgments is time-consuming, expensive, and cannot be used as an …

[BOOK][B] Communication systems

BP Lathi - 1968 - everand.com
“Playing” with notation software, part 2 of 2: There are lots of ways you can manipulate a
notation file for playback purposes. Philip Rothman and David MacDonald continue a two …

Scene-aware audio rendering via deep acoustic analysis

Z Tang, NJ Bryan, D Li, TR Langlois… - IEEE transactions on …, 2020 - ieeexplore.ieee.org
We present a new method to capture the acoustic characteristics of real-world rooms using
commodity devices, and use the captured characteristics to generate similar sounding …

[HTML] Go listen: an end-to-end online listening test platform

D Barry, Q Zhang, PW Sun, A Hines - 2021 - openresearchsoftware.metajnl.com
Subjective listening tests are routinely conducted by academic researchers and
industry professionals to assess the quality of various speech and audio processing …