A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Speaker recognition based on deep learning: An overview

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

X-vectors: Robust DNN embeddings for speaker recognition

D Snyder, D Garcia-Romero, G Sell… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
In this paper, we use data augmentation to improve performance of deep neural network
(DNN) embeddings for speaker recognition. The DNN, which is trained to discriminate …

Biometrics recognition using deep learning: A survey

S Minaee, A Abdolrashidi, H Su, M Bennamoun… - Artificial Intelligence …, 2023 - Springer
In the past few years, deep learning-based models have been very successful in achieving
state-of-the-art results in many tasks in computer vision, speech recognition, and natural …

Large-scale self-supervised speech representation learning for automatic speaker verification

Z Chen, S Chen, Y Wu, Y Qian, C Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The speech representations learned from large-scale unlabeled data have shown better
generalizability than those from supervised learning and thus attract a lot of interest to be …

Deep residual learning for small-footprint keyword spotting

R Tang, J Lin - … Conference on Acoustics, Speech and Signal …, 2018 - ieeexplore.ieee.org
We explore the application of deep residual learning and dilated convolutions to the
keyword spotting task, using the recently-released Google Speech Commands Dataset as …

Black-box adversarial attacks on commercial speech platforms with minimal information

B Zheng, P Jiang, Q Wang, Q Li, C Shen… - Proceedings of the …, 2021 - dl.acm.org
Adversarial attacks against commercial black-box speech platforms, including cloud speech
APIs and voice control devices, have received little attention until recent years. Constructing …

Speech processing for digital home assistants: Combining signal processing with deep-learning techniques

R Haeb-Umbach, S Watanabe… - IEEE Signal …, 2019 - ieeexplore.ieee.org
Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital
home assistants with a spoken language interface have become a ubiquitous commodity …

Large margin softmax loss for speaker verification

Y Liu, L He, J Liu - arXiv preprint arXiv:1904.03479, 2019 - arxiv.org
In neural network based speaker verification, speaker embedding is expected to be
discriminative between speakers while the intra-speaker distance should remain small. A …

State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations

J Villalba, N Chen, D Snyder, D Garcia-Romero… - Computer Speech & …, 2020 - Elsevier
We present a thorough analysis of the systems developed by the JHU-MIT consortium in the
context of NIST speaker recognition evaluation 2018. In the previous NIST evaluation, in …