Conventional and contemporary approaches used in text to speech synthesis: A review

N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer
Nowadays, speech synthesis or text-to-speech (TTS), the ability of a system to produce
human-like, natural-sounding voice from written text, is gaining popularity in the field of speech …

Signal processing methods for music transcription

A Klapuri, M Davy - 2007 - books.google.com
Signal Processing Methods for Music Transcription is the first book dedicated to uniting
research related to signal processing algorithms and models for various aspects of music …

Acoustic properties of different kinds of creaky voice.

PA Keating, M Garellek, J Kreiman - ICPhS, 2015 - idiom.ucsd.edu
There is not one kind, but instead several kinds, of creaky voice, or creak. There is no single
defining property shared by all kinds. Instead, each kind exhibits some properties but not …

YIN, a fundamental frequency estimator for speech and music

A De Cheveigné, H Kawahara - The Journal of the Acoustical Society …, 2002 - pubs.aip.org
An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or
musical sounds. It is based on the well-known autocorrelation method with a number of …

The Montreal Affective Voices: A validated set of nonverbal affect bursts for research on auditory affective processing

P Belin, S Fillion-Bilodeau, F Gosselin - Behavior research methods, 2008 - Springer
The Montreal Affective Voices consist of 90 nonverbal affect bursts corresponding to
the emotions of anger, disgust, fear, pain, sadness, surprise, happiness, and pleasure (plus …

Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity …

H Kawahara, M Morise, T Takahashi… - … on acoustics, speech …, 2008 - ieeexplore.ieee.org
A simple new method for estimating temporally stable power spectra is introduced to provide
a unified basis for computing an interference-free spectrum, the fundamental frequency (F0) …

A sawtooth waveform inspired pitch estimator for speech and music

A Camacho, JG Harris - The Journal of the Acoustical Society of …, 2008 - pubs.aip.org
A sawtooth waveform inspired pitch estimator (SWIPE) has been developed for speech and
music. SWIPE estimates the pitch as the fundamental frequency of the sawtooth waveform …

Joint robust voicing detection and pitch estimation based on residual harmonics

T Drugman, A Alwan - arXiv preprint arXiv:2001.00459, 2019 - arxiv.org
This paper focuses on the problem of pitch tracking in noisy conditions. A method using
harmonic information in the residual signal is presented. The proposed criterion is used both …

STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds

H Kawahara - Acoustical science and technology, 2006 - jstage.jst.go.jp
STRAIGHT, a speech analysis, modification, and synthesis system, is an extension of the
classical channel VOCODER that exploits the advantages of progress in information …

Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis …

H Kawahara, J Estill, O Fujimura - … on models and analysis of vocal …, 2001 - isca-archive.org
A new control paradigm of source signals for high quality speech synthesis is introduced to
handle a variety of speech quality, based on time-frequency analyses by the use of an …