Convolutional Recurrent Neural Networks for Speech Activity Detection in Naturalistic Audio...

Unsupervised representation learning for speech activity detection in the fearless steps challenge 2021

P Gimeno, A Ortega, A Miguel, E Lleida - Interspeech 2021, 2021 - hal.science

In this paper, we describe the ViVoLab speech activity detection (SAD) system submitted to
the Fearless Steps Challenge Phase III. This series of challenges has proposed a number of …

Cited by 6

Multimodal diarization systems by training enrollment models as identity representations

V Mingote, I Viñals, P Gimeno, A Miguel, A Ortega… - Applied Sciences, 2022 - mdpi.com

This paper describes a post-evaluation analysis of the system developed by ViVoLAB
research group for the IberSPEECH-RTVE 2020 Multimodal Diarization (MD) Challenge …

Cited by 4

The domain mismatch problem in the broadcast speaker attribution task

I Viñals, A Ortega, A Miguel, E Lleida - Applied Sciences, 2021 - mdpi.com

The demand for high-quality metadata for the available multimedia content requires the
development of new techniques able to correctly identify more and more information …

Cited by 3

Advances in Binary and Multiclass Audio Segmentation with Deep Learning Techniques: A PhD Thesis Overview

P Gimeno, A Ortega - Proc. IberSPEECH 2024, 2024 - isca-archive.org

Advances in technology have increased multimedia data generation, making manual
analysis impractical and driving the need for automatic tools, often based on deep learning …

Unsupervised adaptation of deep speech activity detection models to unseen domains

P Gimeno, D Ribas, A Ortega, A Miguel, E Lleida - Applied Sciences, 2022 - mdpi.com

Speech Activity Detection (SAD) aims to accurately classify audio fragments containing
human speech. Current state-of-the-art systems for the SAD task are mainly based on deep …

Cited by 3

EML Online Speech Activity Detection for the Fearless Steps Challenge Phase-III

O Ghahabi, V Fischer - arXiv preprint arXiv:2106.11075, 2021 - arxiv.org

Speech Activity Detection (SAD), locating speech segments within an audio recording, is a
main part of most speech technology applications. Robust SAD is usually more difficult in …
