Active speaker detection is an important component in video analysis algorithms for applications such as speaker diarization, video re-targeting for meetings, speech …
Current methods for active speaker detection focus on modeling audiovisual information from a single speaker. This strategy can be adequate for addressing single-speaker …
JL Alcázar, F Caba, AK Thabet… - Proceedings of the …, 2021 - openaccess.thecvf.com
Active speaker detection requires a solid integration of multi-modal cues. While individual modalities can approximate a solution, accurate predictions can only be achieved by …
We present an automatic voice activity detection (VAD) method that is solely based on visual cues. Unlike traditional approaches processing audio, we show that upper body motion …
D Berghi, PJB Jackson - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Conventional audio-visual approaches for active speaker detection (ASD) typically rely on visually pre-extracted face tracks and the corresponding single-channel audio to find the …
We investigated the mouth-opening transition pattern (MOTP), which represents the change of mouth-opening degree during the end of an utterance, and used it to predict the next …
The goal of this work is to synchronise audio and video of a talking face using deep neural network models. Existing works have trained networks on proxy tasks such as cross-modal …
We present a novel vision-based voice activity detection (VAD) method that relies only on automatic upper body motion (UBM) analysis. Traditionally, VAD is performed using audio …
L Pibre, F Madrigal, C Equoy, F Lerasle… - Multimedia Tools and …, 2023 - Springer
Meetings are a common activity in professional contexts, and it remains challenging to endow vocal assistants with advanced functionalities to facilitate meeting management. In …