The third DIHARD diarization challenge

N Ryant, P Singh, V Krishnamohan, R Varma… - arXiv preprint arXiv …, 2020 - arxiv.org
DIHARD III was the third in a series of speaker diarization challenges intended to improve
the robustness of diarization systems to variability in recording equipment, noise conditions …

An experimental review of speaker diarization methods with application to two-speaker conversational telephone speech recordings

L Serafini, S Cornell, G Morrone, E Zovato… - Computer Speech & …, 2023 - Elsevier
We performed an experimental review of current diarization systems for the conversational
telephone speech (CTS) domain. In detail, we considered a total of eight different algorithms …

DiaPer: End-to-end neural diarization with perceiver-based attractors

F Landini, T Stafylakis, L Burget - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Until recently, the field of speaker diarization was dominated by cascaded systems. Due to
their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to …

Encoder-decoder based attractors for end-to-end neural diarization

S Horiguchi, Y Fujita, S Watanabe… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org
This paper investigates an end-to-end neural diarization (EEND) method for an unknown
number of speakers. In contrast to the conventional cascaded approach to speaker …

Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech

K Kinoshita, M Delcroix, N Tawara - arXiv preprint arXiv:2105.09040, 2021 - arxiv.org
Recently, we proposed a novel speaker diarization method called End-to-End-Neural-Diarization-vector
clustering (EEND-vector clustering) that integrates clustering-based and …

Target-speaker voice activity detection via sequence-to-sequence prediction

M Cheng, W Wang, Y Zhang, X Qin… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Target-speaker voice activity detection is currently a promising approach for speaker
diarization in complex acoustic environments. This paper presents a novel Sequence-to …

Towards neural diarization for unlimited numbers of speakers using global and local attractors

S Horiguchi, S Watanabe, P García… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
Attractor-based end-to-end diarization is achieving comparable accuracy to the carefully
tuned conventional clustering-based methods on challenging datasets. However, the main …

From simulated mixtures to simulated conversations as training data for end-to-end neural diarization

F Landini, A Lozano-Diez, M Diez, L Burget - arXiv preprint arXiv …, 2022 - arxiv.org
End-to-end neural diarization (EEND) is nowadays one of the most prominent research
topics in speaker diarization. EEND presents an attractive alternative to standard cascaded …

ANSD-MA-MSE: Adaptive neural speaker diarization using memory-aware multi-speaker embedding

MK He, J Du, QF Liu, CH Lee - IEEE/ACM Transactions on …, 2023 - ieeexplore.ieee.org
In this paper, we propose a neural speaker diarization (NSD) network architecture consisting
of three key components. First, a memory-aware multi-speaker embedding (MA-MSE) …

EEND-SS: Joint end-to-end neural speaker diarization and speech separation for flexible number of speakers

S Maiti, Y Ueda, S Watanabe, C Zhang… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
In this paper, we present a novel framework that jointly performs three tasks: speaker
diarization, speech separation, and speaker counting. Our proposed framework integrates …