Performance analysis of several pitch detection algorithms on simulated and real noisy speech data

D Jouvet, Y Laprie - 2017 25th european signal processing …, 2017 - ieeexplore.ieee.org
This paper analyses the performance of a large number of pitch detection algorithms on clean
and noisy speech data. Two sets of noisy speech data are considered. One corresponds to …

Speaker identification based on normalized pitch frequency and Mel Frequency Cepstral Coefficients

MA Nasr, M Abd-Elnaby, AS El-Fishawy… - International Journal of …, 2018 - Springer
This paper presents an efficient approach for automatic speaker identification based on
cepstral features and the Normalized Pitch Frequency (NPF). Most relevant speaker …

Fundamental frequency estimation of HERM lines of drones

A Huang, P Sévigny, B Balaji… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
Most research on drone detection and classification focuses on using features from micro-
Doppler signatures with blade flashes. However, these methods are limited in range and …

A pitch estimation algorithm for speech in complex noise environments based on the Radon transform

B Li, X Zhang - IEEE Access, 2023 - ieeexplore.ieee.org
The pitch period as an essential feature is used in various speech-related works. Most actual
projects collect speech signals in complex noise environments. Thus, the noise resistance of …

Detection of copy-move forgery in audio signal with mel frequency and delta-mel frequency kepstrum coefficients

F Akdeniz, Y Becerikli - 2021 Innovations in Intelligent Systems …, 2021 - ieeexplore.ieee.org
Digital multimedia security has taken a very important position with the developing
technology. Detecting forgeries in audio signals is one of the most challenging applications in …

Improved visual focus of attention estimation and prosodic features for analyzing group interactions

L Zhang, M Morgan, I Bhattacharya, M Foley… - 2019 International …, 2019 - dl.acm.org
Collaborative group tasks require efficient and productive verbal and non-verbal interactions
among the participants. Studying such interaction patterns could help groups perform more …

Pronunciation scoring with goodness of pronunciation and dynamic time warping

K Sheoran, A Bajgoti, R Gupta, N Jatana… - IEEE …, 2023 - ieeexplore.ieee.org
The current pronunciation scoring based on Goodness of Pronunciation (GOP) uses
posterior probabilities of the Acoustic Models. Such algorithms suffer from generalization …

NoiseVC: Towards high quality zero-shot voice conversion

S Wang, D Borth - arXiv preprint arXiv:2104.06074, 2021 - arxiv.org
Voice conversion (VC) is a task that transforms voice from target audio to source without
losing linguistic content; it is challenging especially when source and target speakers are …

Linear prediction coefficients based copy-move forgery detection in audio signal

F Akdeniz, Y Becerikli - 2022 International Symposium on …, 2022 - ieeexplore.ieee.org
With the advancement of digitalization, the issue of digital multimedia security has become one
of the essential research areas. Digital multimedia security includes the analysis and …

Towards image-based laryngeal videostroboscopy using deep learning-enabled compressed sensing

AM Wölfl, A Schützenberger, K Breininger… - … Signal Processing and …, 2023 - Elsevier
Laryngeal videostroboscopy is an audio-mediated imaging technique allowing the
visualization of vocal fold oscillation behavior: the audio signal is used to determine the …