Fast and easy crowdsourced perceptual audio evaluation

Z Rafii, A Liutkus, FR Stöter, SI Mimilakis… - … on Audio, Speech …, 2018 - ieeexplore.ieee.org

Popular music is often composed of an accompaniment and a lead component, the latter
typically consisting of vocals. Filtering such mixtures to extract one or both components has …

Cited by 135 Related articles All 15 versions

[PDF] arxiv.org

Fastpitch: Parallel text-to-speech with pitch prediction

A Łańcucki - ICASSP 2021-2021 IEEE International Conference …, 2021 - ieeexplore.ieee.org

We present FastPitch, a fully-parallel text-to-speech model based on FastSpeech,
conditioned on fundamental frequency contours. The model predicts pitch contours during …

Cited by 395 Related articles All 3 versions

[BOOK][B] Audio source separation and speech enhancement

E Vincent, T Virtanen, S Gannot - 2018 - books.google.com

Learn the technology behind hearing aids, Siri, and Echo Audio source separation and
speech enhancement aim to extract one or more source signals of interest from an audio …

Cited by 314 Related articles All 8 versions

[PDF] metajnl.com

webMUSHRA—A comprehensive framework for web-based listening tests

M Schoeffler, S Bartoschek… - Journal of …, 2018 - … .openresearchsoftware.metajnl.com

For a long time, many popular listening test methods, such as ITU-R BS.1534 (MUSHRA),
could not be carried out as web-based listening tests, since established web standards did …

Cited by 286 Related articles All 8 versions

[PDF] arxiv.org

A comparison of discrete and soft speech units for improved voice conversion

B Van Niekerk, MA Carbonneau, J Zaïdi… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

The goal of voice conversion is to transform source speech into a target voice, keeping the
content unchanged. In this paper, we focus on self-supervised representation learning for …

Cited by 109 Related articles All 5 versions

[PDF] princeton.edu

FFTNet: A real-time speaker-dependent neural vocoder

Z Jin, A Finkelstein, GJ Mysore… - 2018 IEEE international …, 2018 - ieeexplore.ieee.org

We introduce FFTNet, a deep learning approach synthesizing audio waveforms. Our
approach builds on the recent WaveNet project, which showed that it was possible to …

Cited by 140 Related articles All 9 versions

[PDF] arxiv.org

A differentiable perceptual audio metric learned from just noticeable differences

P Manocha, A Finkelstein, R Zhang, NJ Bryan… - arXiv preprint arXiv …, 2020 - arxiv.org

Many audio processing tasks require perceptual assessment. The "gold standard" of
obtaining human judgments is time-consuming, expensive, and cannot be used as an …

Cited by 86 Related articles All 9 versions

[HTML] everand.com

[BOOK][B] Communication systems

BP Lathi - 1968 - everand.com

“Playing” with notation software, part 2 of 2: There are lots of ways you can manipulate a
notation file for playback purposes. Philip Rothman and David MacDonald continue a two …

Cited by 264 Related articles All 2 versions

[PDF] arxiv.org

Scene-aware audio rendering via deep acoustic analysis

Z Tang, NJ Bryan, D Li, TR Langlois… - IEEE transactions on …, 2020 - ieeexplore.ieee.org

We present a new method to capture the acoustic characteristics of real-world rooms using
commodity devices, and use the captured characteristics to generate similar sounding …

Cited by 46 Related articles All 7 versions

[HTML] metajnl.com

[HTML] Go listen: an end-to-end online listening test platform

D Barry, Q Zhang, PW Sun, A Hines - 2021 - openresearchsoftware.metajnl.com

Résumé Subjective listening tests are routinely conducted by academic researchers and
industry professionals to assess the quality of various speech and audio processing …

Cited by 31 Related articles All 5 versions
