End-to-end neural speaker diarization with self-attention

Y Fujita, N Kanda, S Horiguchi, Y Xue… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org
Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …

Speaker diarization with LSTM

Q Wang, C Downey, L Wan… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
For many years, i-vector based audio embedding techniques were the dominant approach
for speaker verification and speaker diarization applications. However, mirroring the rise of …

End-to-end neural speaker diarization with permutation-free objectives

Y Fujita, N Kanda, S Horiguchi, K Nagamatsu… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper, we propose a novel end-to-end neural-network-based speaker diarization
method. Unlike most existing methods, our proposed method does not have separate …

Fully supervised speaker diarization

A Zhang, Q Wang, Z Zhu, J Paisley… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
In this paper, we propose a fully supervised speaker diarization approach, named
unbounded interleaved-state recurrent neural networks (UIS-RNN). Given extracted speaker …

Turn-to-diarize: Online speaker diarization constrained by transformer transducer speaker turn detection

W Xia, H Lu, Q Wang, A Tripathi… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper, we present a novel speaker diarization system for streaming on-device
applications. In this system, we use a transformer transducer to detect the speaker turns …

End-to-end neural diarization: Reformulating speaker diarization as simple multi-label classification

Y Fujita, S Watanabe, S Horiguchi, Y Xue… - arXiv preprint arXiv …, 2020 - arxiv.org
The most common approach to speaker diarization is clustering of speaker embeddings.
However, the clustering-based approach has a number of problems; i.e., (i) it is not optimized …

Supervised online diarization with sample mean loss for multi-domain data

E Fini, A Brutti - … 2020-2020 IEEE International Conference on …, 2020 - ieeexplore.ieee.org
Recently, a fully supervised speaker diarization approach was proposed (UIS-RNN) which
models speakers using multiple instances of a parameter-sharing recurrent neural network …

Meta-learning with latent space clustering in generative adversarial network for speaker diarization

M Pal, M Kumar, R Peri, TJ Park, SH Kim… - … ACM transactions on …, 2021 - ieeexplore.ieee.org
The performance of most speaker diarization systems with x-vector embeddings is both
vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker …

Incremental face clustering with optimal summary learning via graph convolutional network

X Zhao, Z Wang, L Gao, Y Li… - Tsinghua Science and …, 2021 - ieeexplore.ieee.org
In this study, we address the problems encountered by incremental face clustering. Without
the benefit of having observed the entire data distribution, incremental face clustering is …

Regularized spectral methods for clustering signed networks

M Cucuringu, AV Singh, D Sulem, H Tyagi - Journal of Machine Learning …, 2021 - jmlr.org
We study the problem of k-way clustering in signed graphs. Considerable attention in recent
years has been devoted to analyzing and modeling signed graphs, where the affinity …