The VoiceMOS Challenge 2022

WC Huang, E Cooper, Y Tsao, HM Wang… - arXiv preprint arXiv …, 2022 - arxiv.org
We present the first edition of the VoiceMOS Challenge, a scientific event that aims to
promote the study of automatic prediction of the mean opinion score (MOS) of synthetic …

MBNet: MOS prediction for synthesized speech with mean-bias network

Y Leng, X Tan, S Zhao, F Soong… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Mean opinion score (MOS) is a popular subjective metric to assess the quality of
synthesized speech, and usually involves multiple human judges to evaluate each speech …

Predictions of subjective ratings and spoofing assessments of Voice Conversion Challenge 2020 submissions

RK Das, T Kinnunen, WC Huang, Z Ling… - arXiv preprint arXiv …, 2020 - arxiv.org
The Voice Conversion Challenge 2020 is the third edition under its flagship that promotes
intra-lingual semiparallel and cross-lingual voice conversion (VC). While the primary …

A review on subjective and objective evaluation of synthetic speech

E Cooper, WC Huang, Y Tsao, HM Wang… - Acoustical Science …, 2024 - jstage.jst.go.jp
Evaluating synthetic speech generated by machines is a complicated process, as it involves
judging along multiple dimensions including naturalness, intelligibility, and whether the …

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis

G Maniati, A Vioni, N Ellinas, K Nikitaras… - arXiv preprint arXiv …, 2022 - arxiv.org
In this work, we present the SOMOS dataset, the first large-scale mean opinion scores
(MOS) dataset consisting of solely neural text-to-speech (TTS) samples. It can be employed …

Neural MOS prediction for synthesized speech using multi-task learning with spoofing detection and spoofing type classification

Y Choi, Y Jung, H Kim - 2021 IEEE Spoken Language …, 2021 - ieeexplore.ieee.org
Several studies have proposed deep-learning-based models to predict the mean opinion
score (MOS) of synthesized speech, showing the possibility of replacing human raters …

DeePMOS: deep posterior mean-opinion-score of speech

X Liang, F Cumlin, C Schüldt… - Proceedings of …, 2023 - isca-archive.org
We propose a deep neural network (DNN) based method that provides a posterior
distribution of mean-opinion-score (MOS) for an input speech signal. The DNN outputs …

Non-Intrusive Speech Quality Assessment Based on Deep Neural Networks for Speech Communication

M Liu, J Wang, F Wang, F Xiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Traditionally, speech quality evaluation relies on subjective assessments or intrusive
methods that require reference signals or additional equipment. However, over recent years …

Confidence Intervals for ASR-Based TTS Evaluation

J Taylor, K Richmond - Interspeech, 2021 - isca-archive.org
Automatic speech recognition (ASR) is increasingly used to evaluate the intelligibility of
text-to-speech synthesis (TTS). ASR is less costly than traditional listening tests, but questions …

BIT-MI Deep Learning-based Model to Non-intrusive Speech Quality Assessment Challenge in Online Conferencing Applications

M Liu, J Wang, L Xu, J Zhang, S Li, F Xiang - INTERSPEECH, 2022 - isca-archive.org
This paper presents the details of the BIT-MI deep learning-based model submitted to the
ConferencingSpeech challenge 2022. Due to the large time and labor costs of subjective …