Advances in deep neural network approaches to speaker recognition

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 436 · Related articles · All 9 versions

Speaker identification features extraction methods: A systematic review

SS Tirumala, SR Shahamiri, AS Garhwal… - Expert Systems with …, 2017 - Elsevier

Speaker Identification (SI) is the process of identifying the speaker from a given utterance by
comparing the voice biometrics of the utterance with those utterance models stored …

Cited by 238 · Related articles · All 3 versions

Conv-TasNet: Surpassing ideal time–frequency magnitude masking for speech separation

Y Luo, N Mesgarani - IEEE/ACM transactions on audio, speech …, 2019 - ieeexplore.ieee.org

Single-channel, speaker-independent speech separation methods have recently seen great
progress. However, the accuracy, latency, and computational cost of such methods remain …

Cited by 2220 · Related articles · All 13 versions

Speaker recognition from raw waveform with SincNet

M Ravanelli, Y Bengio - 2018 IEEE spoken language …, 2018 - ieeexplore.ieee.org

Deep learning is progressively gaining popularity as a viable alternative to i-vectors for
speaker recognition. Promising results have been recently obtained with Convolutional …

Cited by 1010 · Related articles · All 10 versions

X-vectors: Robust DNN embeddings for speaker recognition

D Snyder, D Garcia-Romero, G Sell… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

In this paper, we use data augmentation to improve performance of deep neural network
(DNN) embeddings for speaker recognition. The DNN, which is trained to discriminate …

Cited by 3420 · Related articles · All 10 versions

Attentive statistics pooling for deep speaker embedding

K Okabe, T Koshinaka, K Shinoda - arXiv preprint arXiv:1803.10963, 2018 - arxiv.org

This paper proposes attentive statistics pooling for deep speaker embedding in text-
independent speaker verification. In conventional speaker embedding, frame-level features …

Cited by 665 · Related articles · All 10 versions

StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2019 - arxiv.org

Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings
among multiple domains without relying on parallel data. This is important but challenging …

Cited by 183 · Related articles · All 7 versions

Real-time, universal, and robust adversarial attacks against speaker recognition systems

Y Xie, C Shi, Z Li, J Liu, Y Chen… - ICASSP 2020-2020 IEEE …, 2020 - ieeexplore.ieee.org

As the popularity of voice user interface (VUI) exploded in recent years, speaker recognition
system has emerged as an important medium of identifying a speaker in many security …

Cited by 110 · Related articles · All 13 versions

Probing the information encoded in x-vectors

D Raj, D Snyder, D Povey… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

Deep neural network based speaker embeddings, such as x-vectors, have been shown to
perform well in text-independent speaker recognition/verification tasks. In this paper, we use …

Cited by 124 · Related articles · All 11 versions

Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org

Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …

Cited by 115 · Related articles · All 3 versions
