i-Vectors in speech processing applications: a survey

P Verma, PK Das - International Journal of Speech Technology, 2015 - Springer
In the domain of speech recognition many methods have been proposed over time like
Gaussian mixture models (GMM), GMM with universal background model (GMM-UBM …

Improvements to the IBM speech activity detection system for the DARPA RATS program

S Thomas, G Saon, M Van Segbroeck… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org
In this paper we describe improvements to the IBM speech activity detection (SAD) system
for the third phase of the DARPA RATS program. The progress during this final phase comes …

Automatic estimation of Parkinson's disease severity from diverse speech tasks

J Kim, M Nasir, R Gupta, M Van Segbroeck, D Bone… - Interspeech, 2015 - researchgate.net
The need for reliable, scalable and efficient diagnosis of Parkinson's Disease (PD) is a
major clinical need. Automating the diagnosis can lead to more accurate and objective …

Modified-prior i-vector estimation for language identification of short duration utterances

R Travadi, MV Segbroeck, SS Narayanan - Fifteenth Annual Conference …, 2014 - ict.usc.edu
In this paper, we address the problem of Language Identification (LID) on short duration
segments. Current state-of-the-art LID systems typically employ total variability i-Vector …

Rapid language identification

M Van Segbroeck, R Travadi… - IEEE/ACM Transactions …, 2015 - ieeexplore.ieee.org
A critical challenge to automatic language identification (LID) is achieving accurate
performance with the shortest possible speech segment in a rapid fashion. The accuracy to …

Leveraging frequency-dependent kernel and dip-based clustering for robust speech activity detection in naturalistic audio streams

H Dubey, A Sangwan… - IEEE/ACM Transactions on …, 2018 - ieeexplore.ieee.org
Speech activity detection (SAD) is front-end in most speech systems, e.g., speaker
verification, speech recognition, etc. Supervised SAD typically leverages machine learning …

I-vector and variability compensation techniques for mobile phone recognition

A Alimohad, M Bengherabi… - … in Engineering and …, 2024 - ojs.studiespublicacoes.com.br
Mobile phone recognition consists of trying to identify the mobile phone brand or model,
which is very important in forensic analysis. In this paper, we exploit the audio recordings to …

[BOOK] Deep Neural Networks and Model-Based Approaches for Robust Speaker Diarization in Naturalistic Audio Streams

H Dubey - 2019 - search.proquest.com
Speaker diarization is an unsupervised task that determines "who spoke and when" within
input audio stream. It consists of four sub-systems: (i) speech activity detection (SAD); (ii) …

Advances in Feature Extraction and Modelling for Short Duration Language Identification

S Fernando, S Irtza, V Sethu… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
This paper presents an overview of the progression of short duration spoken language
identification systems and current developments. It reviews different language identification …

Deep learning approaches to feature extraction, modelling and compensation for short duration language identification

W Fernando - 2018 - unsworks.unsw.edu.au
Speech signals carry information about the speaker, gender, emotion, and language in
addition to the message being conveyed. A number of inference systems that automatically …