Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization

P Pálka, F Landini, D Klement, M Diez… - arXiv preprint arXiv …, 2024 - arxiv.org
In spite of the popularity of end-to-end diarization systems nowadays, modular systems
comprised of voice activity detection (VAD), speaker embedding extraction plus clustering …

DIHARD-CPqD-Hybrid System

VAM Filho, DA Silva, LGD Cuozzo - dihardchallenge.github.io
The CPqD hybrid diarization system was comprised of the LIUM diarization toolkit using
data-driven, neural network speech activity detection (SAD). The LIUM diarization system is a four …

DIHARD-CPqD-System B1/B2

VAM Filho, DA Silva, LGD Cuozzo - dihardchallenge.github.io
The B1/B2 CPqD diarization system is a combination of two different neural
networks, both with joint optimization for speaker embedding learning, speech activity and …

Joint Discriminative Embedding Learning, Speech Activity and Overlap Detection for the DIHARD Speaker Diarization Challenge

VAM Filho, DA Silva, LGD Cuozzo - isca-archive.org
The DIHARD is a new, annual speaker diarization challenge focusing on “hard” domains, i.e.,
datasets in which current state-of-the-art systems are expected to perform poorly. We present …