MagNetO: X-vector Magnitude Estimation Network plus Offset for Improved Speaker Recognition.

D Garcia-Romero, G Sell, A McCree - Odyssey, 2020 - isca-archive.org
We present a magnitude estimation network that is combined with a modified ResNet x-vector
system to generate embeddings whose inner product is able to produce calibrated …

Child-adult speech diarization in naturalistic conditions of preschool classrooms using room-independent ResNet model and automatic speech recognition-based re …

PV Kothalkar, JHL Hansen, D Irvin… - The Journal of the …, 2024 - pubs.aip.org
Speech and language development are early indicators of overall analytical and learning
ability in children. The preschool classroom is a rich language environment for monitoring …

Deep feature CycleGANs: Speaker identity preserving non-parallel microphone-telephone domain adaptation for speaker verification

S Kataria, J Villalba, P Żelasko… - arXiv preprint arXiv …, 2021 - arxiv.org
With the increase in the availability of speech from varied domains, it is imperative to use
such out-of-domain data to improve existing speech systems. Domain adaptation is a …

Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification

S Kataria, J Villalba, L Moro-Velázquez… - arXiv preprint arXiv …, 2022 - arxiv.org
Speech systems developed for a particular choice of acoustic domain and sampling
frequency do not translate easily to others. The usual practice is to learn domain adaptation …

Knowledge Distillation-based Approaches for Adult-child Communication Assessment Using Speech Processing in Naturalistic Preschool Learning Spaces

PV Kothalkar, JHL Hansen, CA Busso-Recabarren… - 2024 - utd-ir.tdl.org
In Chapter 1, we looked at the motivation for automated speech processing of daylong audio
files with application of feedback to teachers about their interactions with children. The most …