Bird acoustic activity detection based on morphological filtering of the spectrogram

AG De Oliveira, TM Ventura, TD Ganchev… - Applied Acoustics, 2015 - Elsevier
Audio event recognition methods based on the Hidden Markov Model/Gaussian Mixture
Model (HMM/GMM) often depend on a large number of mixture components or multi-stage …

Audio parameterization with robust frame selection for improved bird identification

TM Ventura, AG de Oliveira, TD Ganchev… - Expert Systems with …, 2015 - Elsevier
A major challenge in the automated acoustic recognition of bird species is the audio
segmentation, which aims to select portions of audio that contain meaningful sound events …

Synpaflex-corpus: An expressive French audiobooks corpus dedicated to expressive speech synthesis

A Sini, D Lolive, G Vidal, M Tahon… - Proceedings of the …, 2018 - hal.science
This paper presents an expressive French audiobooks corpus containing eighty-seven
hours of good audio quality speech, recorded by a single amateur speaker reading …

TUNDRA: a multilingual corpus of found data for TTS research created with light supervision

A Stan, O Watts, Y Mamiya, M Giurgiu… - … 2013, 14th Annual …, 2013 - research.ed.ac.uk
Simple4All Tundra (version 1.0) is the first release of a standardised multilingual
corpus designed for text-to-speech research with imperfect or found data. The corpus …

ALISA: An automatic lightly supervised speech segmentation and alignment tool

A Stan, Y Mamiya, J Yamagishi, P Bell, O Watts… - Computer Speech & …, 2016 - Elsevier
This paper describes the ALISA tool, which implements a lightly supervised method for
sentence-level alignment of speech with imperfect transcripts. Its intended use is to enable …

Unsupervised and lightly-supervised learning for rapid construction of TTS systems in multiple languages from 'found' data: evaluation and analysis

O Watts, A Stan, R Clark, Y Mamiya… - 8th ISCA Speech …, 2013 - research.ed.ac.uk
This paper presents techniques for building text-to-speech frontends in a way that avoids the
need for language-specific expert knowledge, but instead relies on universal resources …

A preliminary Plains Cree speech synthesizer

A Harrigan, T Mills, A Arppe - Proceedings of the Workshop …, 2019 - journals.colorado.edu
This paper discusses the development and evaluation of a Speech Synthesizer for Plains
Cree, an Algonquian language of North America. Synthesis is achieved using Simple4All …

Evaluating VAD for automatic speech recognition

S Tong, N Chen, Y Qian, K Yu - 2014 12th International …, 2014 - ieeexplore.ieee.org
Voice activity detection (VAD) plays a crucial role in speech processing, especially in
automatic speech recognition (ASR). It identifies the boundaries of the speech to be …

Characterisation and generation of expressivity in function of speaking styles for audiobook synthesis

A Sini - 2020 - theses.hal.science
In this thesis, we study the expressivity of read speech with a particular type of data, which
are audiobooks. Audiobooks are audio recordings of literary works made by professionals …

A robust FOD acoustic detection method for rocket tank final assembly process

T Lin, Y Zhu, X Zhang, K Huang, K Yan - Applied Acoustics, 2023 - Elsevier
The presence of Foreign Object Debris (FOD) during the assembly of rocket tanks
poses a significant risk to the success of rocket launches. The current manual listening …