On the design of automatic voice condition analysis systems. Part I: Review of concepts and an insight to the state of the art

JA Gómez-García, L Moro-Velázquez… - … Signal Processing and …, 2019 - Elsevier
This is the first of a two-part series devoted to review the current state of the art of automatic
voice condition analysis systems. The goal of this paper is to provide to the scientific …

Age and gender recognition using a convolutional neural network with a specially designed multi-attention module through speech spectrograms

A Tursunov, Mustaqeem, JY Choeh, S Kwon - Sensors, 2021 - mdpi.com
Speech signals are being used as a primary input source in human–computer interaction
(HCI) to develop several applications, such as automatic speech recognition (ASR), speech …

Multimodal age and gender estimation for adaptive human-robot interaction: A systematic literature review

HA Younis, NIR Ruhaiyem, AA Badr, AK Abdul-Hassan… - Processes, 2023 - mdpi.com
Identifying the gender of a person and his age by way of speaking is considered a crucial
task in computer vision. It is a very important and active research topic with many areas of …

Age estimation in short speech utterances based on LSTM recurrent neural networks

R Zazo, PS Nidadavolu, N Chen… - IEEE …, 2018 - ieeexplore.ieee.org
Age estimation from speech has recently received increased interest as it is useful for many
applications such as user-profiling, targeted marketing, or personalized call-routing. This …

Towards speaker age estimation with label distribution learning

S Si, J Wang, J Peng, J Xiao - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Existing methods for speaker age estimation usually treat it as a multi-class classification or
a regression problem. However, precise age identification remains a challenge due to label …

Age and gender classification from speech and face images by jointly fine-tuned deep neural networks

Z Qawaqneh, AA Mallouh, BD Barkana - Expert Systems with Applications, 2017 - Elsevier
The classification of human's age and gender from speech and face images is a challenging
task that has important applications in real-life and its applications are expected to grow …

VoxCeleb enrichment for age and gender recognition

K Hechmi, TN Trong, V Hautamäki… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
VoxCeleb datasets are widely used in speaker recognition studies. Our work serves two
purposes. First, we provide speaker age labels and (an alternative) annotation of speaker …

DiDiSpeech: A large scale Mandarin speech corpus

T Guo, C Wen, D Jiang, N Luo, R Zhang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
This paper introduces a new open-sourced Mandarin speech corpus, called DiDiSpeech. It
consists of about 800 hours of speech data at 48 kHz sampling rate from 6000 speakers and …

Voice activity detection in eco‐acoustic data enables privacy protection and is a proxy for human disturbance

B Cretois, CM Rosten, SS Sethi - Methods in Ecology and …, 2022 - Wiley Online Library
Eco‐acoustic monitoring is increasingly being used to map biodiversity across large scales,
yet little thought is given to the privacy concerns and potential scientific value of …

Gender voice recognition using random forest recursive feature elimination with gradient boosting machines

K Zvarevashe, OO Olugbara - … in big data, computing and data …, 2018 - ieeexplore.ieee.org
Speech emotion recognition is a difficult task in the field of affective computing because
emotions in speech heavily depend on a variety of factors such as feeling, thought …