Joint robust voicing detection and pitch estimation based on residual harmonics

T Drugman, A Alwan - arXiv preprint arXiv:2001.00459, 2019 - arxiv.org
This paper focuses on the problem of pitch tracking in noisy conditions. A method using
harmonic information in the residual signal is presented. The proposed criterion is used both …

Unsupervised speech activity detection using voicing measures and perceptual spectral flux

SO Sadjadi, JHL Hansen - IEEE Signal Processing Letters, 2013 - ieeexplore.ieee.org
Effective speech activity detection (SAD) is a necessary first step for robust speech
applications. In this letter, we propose a robust and unsupervised SAD solution that …

Personal VAD: Speaker-conditioned voice activity detection

S Ding, Q Wang, S Chang, L Wan… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper, we propose "personal VAD", a system to detect the voice activity of a target
speaker at the frame level. This system is useful for gating the inputs to a streaming on …

Voice activity detection using an adaptive context attention model

J Kim, M Hahn - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org
Voice activity detection (VAD) classifies incoming signal segments into speech or
background noise; its performance is crucial in various speech-related applications …

On training targets for noise-robust voice activity detection

S Braun, I Tashev - 2021 29th European Signal Processing …, 2021 - ieeexplore.ieee.org
The task of voice activity detection (VAD) is an often required module in various speech
processing, analysis and classification tasks. While state-of-the-art neural network based …

Innovative method for unsupervised voice activity detection and classification of audio segments

Z Ali, M Talha - IEEE Access, 2018 - ieeexplore.ieee.org
An accurate and noise-robust voice activity detection (VAD) system can be widely used for
emerging speech technologies in the fields of audio forensics, wireless communication, and …

Counting and exploring sizes of Markov equivalence classes of directed acyclic graphs

Y He, J Jia, B Yu - The Journal of Machine Learning Research, 2015 - jmlr.org
When learning a directed acyclic graph (DAG) model via observational data, one generally
cannot identify the underlying DAG, but can potentially obtain a Markov equivalence class …

Multimodal gesture recognition via multiple hypotheses rescoring

V Pitsikalis, A Katsamanis, S Theodorakis… - Gesture recognition, 2017 - Springer
We present a new framework for multimodal gesture recognition that is based on a multiple
hypotheses rescoring fusion scheme. We specifically deal with a demanding Kinect-based …

Supervised/unsupervised voice activity detectors for text-dependent speaker recognition on the RSR2015 corpus

J Alam, P Kenny, P Ouellet, T Stafylakis… - Odyssey speaker and …, 2014 - isca-archive.org
Voice activity detection, i.e., discrimination of the speech/nonspeech segments in a speech
signal, is an important enabling technology for a variety of speech-based applications …

Systems and methods for audio signal processing

E Visser, LH Kim, J Shin, Y Guo, S Ryu… - US Patent …, 2016 - Google Patents
A method for signal level matching by an electronic device is described.
The method includes capturing a plurality of audio signals from a plurality of microphones …