D Berghi, PJB Jackson - IEEE/ACM Transactions on Audio, Speech, and …, 2024 - dl.acm.org
Conventional audio-visual approaches for active speaker detection (ASD) typically rely on
visually pre-extracted face tracks and the corresponding single-channel audio to find the …