Deep MOS predictor for synthetic speech using cluster-based modeling

The VoiceMOS Challenge 2022

WC Huang, E Cooper, Y Tsao, HM Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

We present the first edition of the VoiceMOS Challenge, a scientific event that aims to
promote the study of automatic prediction of the mean opinion score (MOS) of synthetic …

Cited by 112 · Related articles · All 9 versions

MBNet: MOS prediction for synthesized speech with mean-bias network

Y Leng, X Tan, S Zhao, F Soong… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Mean opinion score (MOS) is a popular subjective metric to assess the quality of
synthesized speech, and usually involves multiple human judges to evaluate each speech …

Cited by 98 · Related articles · All 4 versions

Predictions of subjective ratings and spoofing assessments of Voice Conversion Challenge 2020 submissions

RK Das, T Kinnunen, WC Huang, Z Ling… - arXiv preprint arXiv …, 2020 - arxiv.org

The Voice Conversion Challenge 2020 is the third edition under its flagship that promotes
intra-lingual semiparallel and cross-lingual voice conversion (VC). While the primary …

Cited by 59 · Related articles · All 8 versions

A review on subjective and objective evaluation of synthetic speech

E Cooper, WC Huang, Y Tsao, HM Wang… - Acoustical Science …, 2024 - jstage.jst.go.jp

Evaluating synthetic speech generated by machines is a complicated process, as it involves
judging along multiple dimensions including naturalness, intelligibility, and whether the …

Cited by 10 · Related articles

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis

G Maniati, A Vioni, N Ellinas, K Nikitaras… - arXiv preprint arXiv …, 2022 - arxiv.org

In this work, we present the SOMOS dataset, the first large-scale mean opinion scores
(MOS) dataset consisting of solely neural text-to-speech (TTS) samples. It can be employed …

Cited by 25 · Related articles · All 8 versions

Neural MOS prediction for synthesized speech using multi-task learning with spoofing detection and spoofing type classification

Y Choi, Y Jung, H Kim - 2021 IEEE Spoken Language …, 2021 - ieeexplore.ieee.org

Several studies have proposed deep-learning-based models to predict the mean opinion
score (MOS) of synthesized speech, showing the possibility of replacing human raters …

Cited by 29 · Related articles · All 6 versions

DeePMOS: deep posterior mean-opinion-score of speech

X Liang, F Cumlin, C Schüldt… - Proceedings of …, 2023 - isca-archive.org

We propose a deep neural network (DNN) based method that provides a posterior
distribution of mean-opinion-score (MOS) for an input speech signal. The DNN outputs …

Cited by 5 · Related articles · All 4 versions

Non-Intrusive Speech Quality Assessment Based on Deep Neural Networks for Speech Communication

M Liu, J Wang, F Wang, F Xiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Traditionally, speech quality evaluation relies on subjective assessments or intrusive
methods that require reference signals or additional equipment. However, over recent years …

Cited by 3 · Related articles · All 3 versions

Confidence Intervals for ASR-Based TTS Evaluation

J Taylor, K Richmond - Interspeech, 2021 - isca-archive.org

Automatic speech recognition (ASR) is increasingly used to evaluate the intelligibility of
text-to-speech synthesis (TTS). ASR is less costly than traditional listening tests, but questions …

Cited by 15 · Related articles · All 5 versions

BIT-MI Deep Learning-based Model to Non-intrusive Speech Quality Assessment Challenge in Online Conferencing Applications

M Liu, J Wang, L Xu, J Zhang, S Li, F Xiang - INTERSPEECH, 2022 - isca-archive.org

This paper presents the details of the BIT-MI deep learning-based model submitted to the
ConferencingSpeech challenge 2022. Due to the large time and labor costs of subjective …

Cited by 5 · Related articles · All 5 versions
