How to train your speaker embeddings extractor

D Snyder, D Garcia-Romero, G Sell… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …

Cited by 394 · Related articles · All 7 versions

The voices from a distance challenge 2019 evaluation plan

MK Nandwana, J Van Hout, M McLaren… - arXiv preprint arXiv …, 2019 - arxiv.org

The "VOiCES from a Distance Challenge 2019" is designed to foster research in the area of
speaker recognition and automatic speech recognition (ASR) with the special focus on …

Cited by 107 · Related articles · All 7 versions

MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition.

D Garcia-Romero, G Sell, A McCree - Odyssey, 2020 - isca-archive.org

We present a magnitude estimation network that is combined with a modified ResNet
x-vector system to generate embeddings whose inner product is able to produce calibrated …

Cited by 75 · Related articles · All 4 versions

Deep CNNs with self-attention for speaker identification

NN An, NQ Thanh, Y Liu - IEEE access, 2019 - ieeexplore.ieee.org

Most current works on speaker identification are based on i-vector methods; however, there
is a marked shift from the traditional i-vector to deep learning methods, especially in the form …

Cited by 91 · Related articles · All 3 versions

x-vector DNN refinement with full-length recordings for speaker recognition.

D Garcia-Romero, D Snyder, G Sell, A McCree… - Interspeech, 2019 - danielpovey.com

State-of-the-art text-independent speaker recognition systems for long recordings (a few
minutes) are based on deep neural network (DNN) speaker embeddings. Current …

Cited by 57 · Related articles · All 10 versions

Speaker Augmentation and Bandwidth Extension for Deep Speaker Embedding.

H Yamamoto, KA Lee, K Okabe, T Koshinaka - Interspeech, 2019 - researchgate.net

This paper investigates a novel data augmentation approach to train deep neural networks
(DNNs) used for speaker embedding, i.e., to extract representation that allows easy …

Cited by 64 · Related articles · All 6 versions

JHU-HLTCOE system for the VoxSRC speaker recognition challenge

D Garcia-Romero, A McCree… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

The VoxSRC speaker recognition challenge comprises data obtained from YouTube videos
of celebrity interviews in a wide range of recording environments. The challenge provides …

Cited by 52 · Related articles

Disentangling style factors from speaker representations

J Williams, S King - 20th Annual Conference of the International …, 2019 - research.ed.ac.uk

Our goal is to separate out speaking style from speaker identity in utterance-level
representations of speech such as i-vectors and x-vectors. We first show that both i-vectors …

Cited by 59 · Related articles · All 7 versions

How to improve your speaker embeddings extractor in generic toolkits

H Zeinali, L Burget, J Rohdin… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Recently, speaker embeddings extracted with deep neural networks became the
state-of-the-art method for speaker verification. In this paper we aim to facilitate its implementation on a …

Cited by 62 · Related articles · All 5 versions

A speaker verification backend with robust performance across conditions

L Ferrer, M McLaren, N Brümmer - Computer Speech & Language, 2022 - Elsevier

In this paper, we address the problem of speaker verification in conditions unseen or
unknown during development. A standard method for speaker verification consists of …

Cited by 33 · Related articles · All 7 versions
