Learning speaker-specific characteristics with deep neural architecture

D Sztahó, G Szaszák, A Beke - arXiv preprint arXiv:1911.06615, 2019 - arxiv.org

This paper summarizes the applied deep learning practices in the field of speaker
recognition, both verification and identification. Speaker recognition has been a widely used …

Cited by 82 · Related articles · All 8 versions

[PDF] danielpovey.com

X-vectors: Robust dnn embeddings for speaker recognition

D Snyder, D Garcia-Romero, G Sell… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

In this paper, we use data augmentation to improve performance of deep neural network
(DNN) embeddings for speaker recognition. The DNN, which is trained to discriminate …

Cited by 3420 · Related articles · All 10 versions

[PDF] isca-archive.org

[PDF] Deep neural network embeddings for text-independent speaker verification.

D Snyder, D Garcia-Romero, D Povey, S Khudanpur - Interspeech, 2017 - isca-archive.org

This paper investigates replacing i-vectors for text-independent speaker verification with
embeddings extracted from a feedforward deep neural network. Long-term speaker …

Cited by 1117 · Related articles · All 11 versions

[PDF] danielpovey.com

Speaker recognition for multi-speaker conversations using x-vectors

D Snyder, D Garcia-Romero, G Sell… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Recently, deep neural networks that map utterances to fixed-dimensional embeddings have
emerged as the state-of-the-art in speaker recognition. Our prior work introduced x-vectors …

Cited by 393 · Related articles · All 7 versions

[PDF] danielpovey.com

Deep neural network-based speaker embeddings for end-to-end speaker verification

D Snyder, P Ghahremani, D Povey… - 2016 IEEE spoken …, 2016 - ieeexplore.ieee.org

In this study, we investigate an end-to-end text-independent speaker verification system. The
architecture consists of a deep neural network that takes a variable length speech segment …

Cited by 461 · Related articles · All 6 versions

[PDF] arxiv.org

Attentive temporal pooling for conformer-based streaming language identification in long-form speech

Q Wang, Y Yu, J Pelecanos, Y Huang… - arXiv preprint arXiv …, 2022 - arxiv.org

In this paper, we introduce a novel language identification system based on conformer
layers. We propose an attentive temporal pooling mechanism to allow the model to carry …

Cited by 16 · Related articles · All 4 versions

[PDF] google.com

Memory storable network based feature aggregation for speaker representation learning

B Gu, W Guo, J Zhang - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

Learning fixed-dimensional speaker representation using deep neural networks is a key
step in speaker verification. In this work, we propose an auxiliary memory storable network …

Cited by 11 · Related articles · All 3 versions

A Dynamic Convolution Framework for Session-Independent Speaker Embedding Learning

B Gu, J Zhang, W Guo - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org

Speaker verification (SV) has suffered from session variability in complex acoustic
scenarios, and learning session independent speaker representations remains a …

Cited by 2 · Related articles · All 2 versions

[PDF] jhu.edu

[PDF] X-Vectors: Robust neural embeddings for speaker recognition

D Snyder - 2020 - jscholarship.library.jhu.edu

Speaker recognition is the task of identifying speakers based on their speech signal.
Typically, this involves comparing speech from a known speaker, with recordings from …

Cited by 9 · Related articles · All 2 versions

A bayesian attention neural network layer for speaker recognition

W Zhu, J Pelecanos - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org

Neural network based attention modeling has found utility in areas such as visual analysis,
speech recognition and more recently speaker recognition. Attention represents a gating (or …

Cited by 9 · Related articles · All 2 versions
