Variable frame rate-based data augmentation to handle speaking-style variability for automatic...

D Sztahó, A Fejes - Journal of forensic sciences, 2023 - Wiley Online Library

In forensic voice comparison, deep learning has become widely popular recently. It is mainly
used to learn speaker representations, called embeddings or embedding vectors. Speaker …

Cited by 11 - Related articles - All 8 versions

[PDF] arxiv.org

Attention-based conditioning methods using variable frame rate for style-robust speaker verification

A Afshan, A Alwan - arXiv preprint arXiv:2206.13680, 2022 - arxiv.org

We propose an approach to extract speaker embeddings that are robust to speaking style
variations in text-independent speaker verification. Typically, speaker embedding extraction …

Cited by 3 - Related articles - All 8 versions

[PDF] arxiv.org

A principle solution for enroll-test mismatch in speaker recognition

L Li, D Wang, J Kang, R Wang, J Wu… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org

Mismatch between enrollment and test conditions causes serious performance degradation
on speaker recognition systems. This paper presents a statistics decomposition (SD) …

Cited by 5 - Related articles - All 8 versions

[PDF] arxiv.org

Learning from human perception to improve automatic speaker verification in style-mismatched conditions

A Afshan, A Alwan - arXiv preprint arXiv:2206.13684, 2022 - arxiv.org

Our prior experiments show that humans and machines seem to employ different
approaches to speaker discrimination, especially in the presence of speaking style …

Cited by 1 - Related articles - All 7 versions

[PDF] escholarship.org

[BOOK][B] Speaking style variability in speaker discrimination by humans and machines

A Afshan - 2022 - search.proquest.com

A speaker's voice constantly varies in everyday situations, such as when talking to a friend,
reading aloud, talking to pets, or narrating a happy incident. These changes in speaking …

Cited by 2 - Related articles - All 4 versions

[PDF] escholarship.org

Accuracy and Privacy in Speech-Based Modeling of Major Depression: Innovative Approaches Through Data Augmentation, and Speaker Identity Disentanglement

V Ravi - 2024 - search.proquest.com

Abstract Major Depressive Disorder (MDD) is a prevalent mental illness that affects a
significant portion of the global population. Despite its severity, traditional diagnostic …

[BOOK][B] Towards Better Automatic Speech Recognition Systems for Children

Y Zhu - 2023 - search.proquest.com

This thesis aims to achieve better automatic speech recognition (ASR) for children. The most
challenging problem is a lack of transcribed available databases, and thus this problem …

[PDF] Towards understanding speaker perception and its applications to automatic speaker recognition: effects of speaking style variability

A Afshan - isca-students.org

A speaker's voice constantly varies in everyday situations such as talking to a friend, reading
aloud, talking to pets, or narrating a sad incident. These speaking style changes affect the …
