X-vectors: Robust DNN embeddings for speaker recognition

D Snyder, D Garcia-Romero, G Sell… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
In this paper, we use data augmentation to improve performance of deep neural network
(DNN) embeddings for speaker recognition. The DNN, which is trained to discriminate …

Speaker recognition for multi-speaker conversations using x-vectors

D Snyder, D Garcia-Romero, G Sell… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …

Privacy implications of voice and speech analysis–information disclosure by inference

JL Kröger, OHM Lutz, P Raschke - … Management. Data for Better Living: AI …, 2020 - Springer
Internet-connected devices, such as smartphones, smartwatches, and laptops, have become
ubiquitous in modern life, reaching ever deeper into our private spheres. Among the sensors …

The VOiCES from a Distance Challenge 2019 evaluation plan

MK Nandwana, J Van Hout, M McLaren… - arXiv preprint arXiv …, 2019 - arxiv.org
The "VOiCES from a Distance Challenge 2019" is designed to foster research in the area of
speaker recognition and automatic speech recognition (ASR) with the special focus on …

MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition

D Garcia-Romero, G Sell, A McCree - Odyssey, 2020 - isca-archive.org
We present a magnitude estimation network that is combined with a modified ResNet
x-vector system to generate embeddings whose inner product is able to produce calibrated …

x-vector DNN refinement with full-length recordings for speaker recognition

D Garcia-Romero, D Snyder, G Sell, A McCree… - Interspeech, 2019 - danielpovey.com
State-of-the-art text-independent speaker recognition systems for long recordings (a few
minutes) are based on deep neural network (DNN) speaker embeddings. Current …

On deep speaker embeddings for text-independent speaker recognition

S Novoselov, A Shulipa, I Kremnev, A Kozlov… - arXiv preprint arXiv …, 2018 - arxiv.org
We investigate deep neural network performance in the text-independent speaker
recognition task. We demonstrate that using angular softmax activation at the last …

How to improve your speaker embeddings extractor in generic toolkits

H Zeinali, L Burget, J Rohdin… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Recently, speaker embeddings extracted with deep neural networks became the
state-of-the-art method for speaker verification. In this paper we aim to facilitate its implementation on a …

Triplet Loss Based Cosine Similarity Metric Learning for Text-independent Speaker Recognition

S Novoselov, V Shchemelinin, A Shulipa, A Kozlov… - Interspeech, 2018 - isca-archive.org
Deep neural network based speaker embeddings become increasingly popular in the
text-independent speaker recognition task. In contrast to a generatively trained i-vector extractor …

Self-supervised speaker embeddings

T Stafylakis, J Rohdin, O Plchot, P Mizera… - arXiv preprint arXiv …, 2019 - arxiv.org
Contrary to i-vectors, speaker embeddings such as x-vectors are incapable of leveraging
unlabelled utterances, due to the classification loss over training speakers. In this paper, we …