Extraction and utilization of excitation information of speech: A review

SR Kadiri, P Alku, B Yegnanarayana - Proceedings of the IEEE, 2021 - ieeexplore.ieee.org
Speech production can be regarded as a process where a time-varying vocal tract system
(filter) is excited by a time-varying excitation. In addition to its linguistic message, the speech …

Towards Automated Vocal Mode Classification in Healthy Singing Voice—An XGBoost Decision Tree-Based Machine Learning Classifier

J Sol, M Aaen, C Sadolin, L Ten Bosch - Journal of Voice, 2023 - Elsevier
Auditory-perceptual assessment is widely used in clinical and pedagogical practice for
speech and singing voice, yet several studies have shown poor intra- and inter-rater …

Mapping phonation types by clustering of multiple metrics

H Cai, S Ternström - Applied Sciences, 2022 - mdpi.com
Featured Application: Categorical voice assessment based on a classification algorithm that
can assist clinicians by visualizing phonation types on a 2-D voice map. Abstract: For voice …

Investigation of self-supervised pre-trained models for classification of voice quality from speech and neck surface accelerometer signals

SR Kadiri, F Javanmardi, P Alku - Computer Speech & Language, 2024 - Elsevier
Prior studies in the automatic classification of voice quality have mainly studied the use of
the acoustic speech signal as input. Recently, a few studies have been carried out by jointly …

Disentangled adversarial domain adaptation for phonation mode detection in singing and speech

Y Wang, W Wei, X Gu, X Guan… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Phonation mode detection predicts phonation modes and their temporal boundaries in
singing and speech, holding promise for characterizing voice quality and vocal health …

Mel-weighted single frequency filtering spectrogram for dialect identification

R Kethireddy, SR Kadiri, P Alku… - IEEE Access, 2020 - ieeexplore.ieee.org
In this study, we propose Mel-weighted single frequency filtering (SFF) spectrograms for
dialect identification. The spectrum derived using SFF has high spectral resolution for …

The effect of the MFCC frame length in automatic voice pathology detection

S Tirronen, SR Kadiri, P Alku - Journal of Voice, 2024 - Elsevier
Automatic voice pathology detection is a research topic, which has gained increasing
interest recently. Although methods based on deep learning are becoming popular, the …

Analysis of instantaneous frequency components of speech signals for epoch extraction

SR Kadiri, P Alku, B Yegnanarayana - Computer Speech & Language, 2023 - Elsevier
The major impulse-like excitation in the speech signal is due to abrupt closure of the vocal
folds, which takes place at the glottal closure instant (GCI) or epoch in each cycle. GCIs are …

Horizontal and vertical voice directivity characteristics of sung vowels in classical singing

M Brandner, M Frank, A Sontacchi - Acoustics, 2022 - mdpi.com
Singing voice directivity for five sustained German vowels /a:/, /e:/, /i:/, /o:/, /u:/ over a wide pitch
range was investigated using a multichannel microphone array with high spatial resolution …

Classification of phonation modes in classical singing using modulation power spectral features

M Brandner, PA Bereuter, SR Kadiri… - IEEE Access, 2023 - ieeexplore.ieee.org
In singing, the perceptual term “voice quality” is used to describe expressed emotions and
singing styles. In voice physiology research, specific voice qualities are discussed using the …