Multi-resolution multi-head attention in deep speaker embedding

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 439 · Related articles · All 9 versions

Deep speaker embeddings for Speaker Verification: Review and experimental comparison

M Jakubec, R Jarina, E Lieskovska, P Kasak - Engineering Applications of …, 2024 - Elsevier

The construction of speaker-specific acoustic models for automatic speaker recognition is
almost exclusively based on deep neural network-based speaker embeddings. This work …

Cited by 21 · Related articles · All 2 versions

[PDF] arxiv.org

Multi-view self-attention based transformer for speaker recognition

R Wang, J Ao, L Zhou, S Liu, Z Wei, T Ko… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Initially developed for natural language processing (NLP), Transformer model is now widely
used for speech processing tasks such as speaker recognition, due to its powerful sequence …

Cited by 50 · Related articles · All 5 versions

MEConformer: Highly representative embedding extractor for speaker verification via incorporating selective convolution into deep speaker encoder

Q Zheng, Z Chen, Z Wang, H Liu, M Lin - Expert Systems with Applications, 2024 - Elsevier

Transformer models have demonstrated superior performance across various domains,
including computer vision, natural language processing, and speech recognition. The …

Cited by 13 · Related articles

[PDF] arxiv.org

Audio deepfake detection system with neural stitching for ADD 2022

R Yan, C Wen, S Zhou, T Guo… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

This paper describes our best system and methodology for ADD 2022: The First Audio Deep
Synthesis Detection Challenge [1]. The very same system was used for both two rounds of …

Cited by 25 · Related articles · All 3 versions

[PDF] xiaolei-zhang.net

[PDF] Branch-ECAPA-TDNN: A parallel branch architecture to capture local and global features for speaker verification

J Yao, C Liang, Z Peng, B Zhang… - Proc. of …, 2023 - xiaolei-zhang.net

Currently, ECAPA-TDNN is one of the state-of-the-art deep models for automatic speaker
verification (ASV). However, it focuses too much on local feature extraction with fixed local …

Cited by 15 · Related articles · All 3 versions

[HTML] sciencedirect.com

[HTML] Causal reasoning for algorithmic fairness in voice controlled cyber-physical systems

G Fenu, M Marras, G Medda, G Meloni - Pattern Recognition Letters, 2023 - Elsevier

Automated speaker recognition is enabling personalized interactions with the voice-based
interfaces and assistants part of the modern cyber-physical-social systems. Prior studies …

Cited by 5 · Related articles · All 3 versions

Time-domain speaker verification using temporal convolutional networks

S Han, J Byun, JW Shin - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

Recently, speaker verification systems using deep neural networks have been widely
studied. Many of them utilize hand-crafted features such as mel-filterbank energies, mel …

Cited by 15 · Related articles

[PDF] mdpi.com

Global–local self-attention based transformer for speaker verification

F Xie, D Zhang, C Liu - Applied Sciences, 2022 - mdpi.com

Transformer models are now widely used for speech processing tasks due to their powerful
sequence modeling capabilities. Previous work determined an efficient way to model …

Cited by 9 · Related articles · All 4 versions

[PDF] arxiv.org

Dictionary attacks on speaker verification

M Marras, P Korus, A Jain… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

In this paper, we propose dictionary attacks against speaker verification, a novel attack
vector that aims to match a large fraction of speaker population by chance. We introduce a …

Cited by 8 · Related articles · All 6 versions
