Speaker recognition for multi-speaker conversations using x-vectors

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 373 - Related articles - All 9 versions

[PDF] ieee.org

A survey of speaker recognition: Fundamental theories, recognition methods and opportunities

MM Kabir, MF Mridha, J Shin, I Jahan, AQ Ohi - IEEE Access, 2021 - ieeexplore.ieee.org

Humans can identify a speaker by listening to their voice, over the telephone, or on any
digital devices. Acquiring this congenital human competency, authentication technologies …

Cited by 102 - Related articles - All 4 versions

[PDF] arxiv.org

ECAPA-TDNN: Emphasized channel attention, propagation and aggregation in TDNN based speaker verification

B Desplanques, J Thienpondt, K Demuynck - arXiv preprint arXiv …, 2020 - arxiv.org

Current speaker verification techniques rely on a neural network to extract speaker
representations. The successful x-vector architecture is a Time Delay Neural Network …

Cited by 1355 - Related articles - All 15 versions

[PDF] arxiv.org

In defence of metric learning for speaker recognition

JS Chung, J Huh, S Mun, M Lee, HS Heo… - arXiv preprint arXiv …, 2020 - arxiv.org

The objective of this paper is 'open-set' speaker recognition of unseen speakers, where ideal
embeddings should be able to condense information into a compact utterance-level …

Cited by 489 - Related articles - All 11 versions

[PDF] arxiv.org

BUT system description to VoxCeleb Speaker Recognition Challenge 2019

H Zeinali, S Wang, A Silnova, P Matějka… - arXiv preprint arXiv …, 2019 - arxiv.org

In this report, we describe the submission of Brno University of Technology (BUT) team to
the VoxCeleb Speaker Recognition Challenge (VoxSRC) 2019. We also provide a brief …

Cited by 276 - Related articles - All 5 versions

[PDF] arxiv.org

MM-VID: Advancing video understanding with GPT-4V(ision)

K Lin, F Ahmed, L Li, CC Lin, E Azarnasab… - arXiv preprint arXiv …, 2023 - arxiv.org

We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V,
combined with specialized tools in vision, audio, and speech, to facilitate advanced video …

Cited by 34 - Related articles - All 2 versions

[PDF] arxiv.org

End-to-end neural speaker diarization with self-attention

Y Fujita, N Kanda, S Horiguchi, Y Xue… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …

Cited by 262 - Related articles - All 7 versions

[PDF] arxiv.org

Who is real Bob? Adversarial attacks on speaker recognition systems

G Chen, S Chen, L Fan, X Du, Z Zhao… - … IEEE Symposium on …, 2021 - ieeexplore.ieee.org

Speaker recognition (SR) is widely used in our daily life as a biometric authentication or
identification mechanism. The popularity of SR brings in serious security concerns, as …

Cited by 221 - Related articles - All 14 versions

[PDF] arxiv.org

End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors

S Horiguchi, Y Fujita, S Watanabe, Y Xue… - arXiv preprint arXiv …, 2020 - arxiv.org

End-to-end speaker diarization for an unknown number of speakers is addressed in this
paper. Recently proposed end-to-end speaker diarization outperformed conventional …

Cited by 184 - Related articles - All 11 versions

[PDF] arxiv.org

Backdoor attack against speaker verification

T Zhai, Y Li, Z Zhang, B Wu, Y Jiang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Speaker verification has been widely and successfully adopted in many mission-critical
areas for user identification. The training of speaker verification requires a large amount of …

Cited by 111 - Related articles - All 5 versions
