Speaker identification and clustering using convolutional neural networks

Y Lukic, C Vogt, O Dürr… - 2016 IEEE 26th …, 2016 - ieeexplore.ieee.org
Deep learning, especially in the form of convolutional neural networks (CNNs), has triggered
substantial improvements in computer vision and related fields in recent years. This …

Active learning based constrained clustering for speaker diarization

C Yu, JHL Hansen - IEEE/ACM Transactions on Audio, Speech …, 2017 - ieeexplore.ieee.org
Most speaker diarization research has focused on unsupervised scenarios, where no
human supervision is available. However, in many real-world applications, a certain amount …

A novel LSTM-based speech preprocessor for speaker diarization in realistic mismatch conditions

L Sun, J Du, T Gao, YD Lu, Y Tsao… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org
In this study, we investigate the effects of deep learning based speech enhancement as a
preprocessor to speaker diarization in quite challenging realistic environments involving the …

Novel architectures for unsupervised information bottleneck based speaker diarization of meetings

N Dawalatabad, S Madikeri, CC Sekhar… - … /ACM Transactions on …, 2020 - ieeexplore.ieee.org
Speaker diarization is an important problem that is topical, and is especially useful as a
preprocessor for conversational speech related applications. The objective of this article is …

Overo: Sharing Private Audio Recordings

J Lim, K Kim, H Yu, SB Lee - Proceedings of the 2022 ACM SIGSAC …, 2022 - dl.acm.org
The use of smartphones as voice recorders has made it easy to record audios as proof of
conversations, but sharing of such audio evidence incurs speech and voice privacy risks …

Speech refinement using Bi-LSTM and improved spectral clustering in speaker diarization

A Gupta, A Purwar - Multimedia Tools and Applications, 2024 - Springer
In this digitally-driven culture, the need and demand for diarizing online meetings, classes,
conferences, and medical diagnoses have increased a lot. Speaker Diarization, a sub …

Spatial features selection for unsupervised speaker segmentation and clustering

B Martínez-González, JM Pardo… - Expert Systems with …, 2017 - Elsevier
The selection of the best features to be used in expert systems is a key issue in obtaining a
satisfactory performance. Unsupervised speaker segmentation and clustering is the task of …

Supervised speaker diarization using random forests: a tool for psychotherapy process research

L Fürer, N Schenk, V Roth, M Steppan… - Frontiers in …, 2020 - frontiersin.org
Speaker diarization is the practice of determining who speaks when in audio recordings.
Psychotherapy research often relies on labor intensive manual diarization. Unsupervised …

Robust speaker clustering using mixtures of von Mises-Fisher distributions for naturalistic audio streams

H Dubey, A Sangwan, JHL Hansen - arXiv preprint arXiv:1808.06045, 2018 - arxiv.org
Speaker Diarization (i.e. determining who spoke and when?) for multi-speaker naturalistic
interactions such as Peer-Led Team Learning (PLTL) sessions is a challenging task. In this …

Unsupervised classification of speaker roles in multi-participant conversational speech

Y Li, Q Wang, X Zhang, W Li, X Li, J Yang… - Computer Speech & …, 2017 - Elsevier
This paper proposes an unsupervised method for analyzing speaker roles in
multi-participant conversational speech. First, features for characterizing the differences of various …