Lightly supervised GMM VAD to use audiobook for speech synthesiser

AG De Oliveira, TM Ventura, TD Ganchev… - Applied Acoustics, 2015 - Elsevier

Audio event recognition methods based on the Hidden Markov Model/Gaussian Mixture
Model (HMM/GMM) often depend on a large number of mixture components or multi-stage …

Cited by 82 (all 4 versions)

Audio parameterization with robust frame selection for improved bird identification

TM Ventura, AG de Oliveira, TD Ganchev… - Expert Systems with …, 2015 - Elsevier

A major challenge in the automated acoustic recognition of bird species is the audio
segmentation, which aims to select portions of audio that contain meaningful sound events …

Cited by 53 (all 5 versions)

Synpaflex-corpus: An expressive French audiobooks corpus dedicated to expressive speech synthesis

A Sini, D Lolive, G Vidal, M Tahon… - Proceedings of the …, 2018 - hal.science

This paper presents an expressive French audiobooks corpus containing eighty seven
hours of good audio quality speech, recorded by a single amateur speaker reading …

Cited by 34 (all 10 versions)

TUNDRA: a multilingual corpus of found data for TTS research created with light supervision

A Stan, O Watts, Y Mamiya, M Giurgiu… - … 2013, 14th Annual …, 2013 - research.ed.ac.uk

Simple4All Tundra (version 1.0) is the first release of a standardised multilingual
corpus designed for text-to-speech research with imperfect or found data. The corpus …

Cited by 54 (all 17 versions)

ALISA: An automatic lightly supervised speech segmentation and alignment tool

A Stan, Y Mamiya, J Yamagishi, P Bell, O Watts… - Computer Speech & …, 2016 - Elsevier

This paper describes the ALISA tool, which implements a lightly supervised method for
sentence-level alignment of speech with imperfect transcripts. Its intended use is to enable …

Cited by 41 (all 7 versions)

Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis

O Watts, A Stan, R Clark, Y Mamiya… - 8th ISCA Speech …, 2013 - research.ed.ac.uk

This paper presents techniques for building text-to-speech frontends in a way that avoids the
need for language-specific expert knowledge, but instead relies on universal resources …

Cited by 38 (all 16 versions)

A preliminary Plains Cree speech synthesizer

A Harrigan, T Mills, A Arppe - Proceedings of the Workshop …, 2019 - journals.colorado.edu

This paper discusses the development and evaluation of a Speech Synthesizer for Plains
Cree, an Algonquian language of North America. Synthesis is achieved using Simple4All …

Cited by 16 (all 13 versions)

Evaluating VAD for automatic speech recognition

S Tong, N Chen, Y Qian, K Yu - 2014 12th International …, 2014 - ieeexplore.ieee.org

Voice activity detection (VAD) plays a crucial role in speech processing, especially in
automatic speech recognition (ASR). It identifies the boundaries of the speech to be …

Cited by 21 (all 2 versions)

Characterisation and generation of expressivity in function of speaking styles for audiobook synthesis

A Sini - 2020 - theses.hal.science

In this thesis, we study the expressivity of read speech with a particular type of data, which
are audiobooks. Audiobooks are audio recordings of literary works made by professionals …

Cited by 9 (all 4 versions)

A robust FOD acoustic detection method for rocket tank final assembly process

T Lin, Y Zhu, X Zhang, K Huang, K Yan - Applied Acoustics, 2023 - Elsevier

The presence of Foreign Object Debris (FOD) during the assembly of rocket tanks
poses a significant risk to the success of rocket launches. The current manual listening …
