Efficiently trainable text-to-speech system based on deep convolutional networks with guided attention

H Tachibana, K Uenoyama… - 2018 IEEE international …, 2018 - ieeexplore.ieee.org
This paper describes a novel text-to-speech (TTS) technique based on deep convolutional
neural networks (CNN), without use of any recurrent units. Recurrent neural networks (RNN) …

SALMA: UWB-based single-anchor localization system using multipath assistance

B Großwindhager, M Rath, J Kulmer, MS Bakr… - Proceedings of the 16th …, 2018 - dl.acm.org
Setting up indoor localization systems is often excessively time-consuming and
labor-intensive, because of the high amount of anchors to be carefully deployed or the …

A noniterative method for reconstruction of phase from STFT magnitude

Z Průša, P Balazs… - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org
A noniterative method for the reconstruction of the short-time Fourier transform (STFT) phase
from the magnitude is presented. The method is based on the direct relationship between …

Griffin–Lim like phase recovery via alternating direction method of multipliers

Y Masuyama, K Yatabe… - IEEE Signal Processing …, 2018 - ieeexplore.ieee.org
Recovering a signal from its amplitude spectrogram, or phase recovery, exhibits many
applications in acoustic signal processing. When only an amplitude spectrogram is available …

Neural Homomorphic Vocoder.

Z Liu, K Chen, K Yu - Interspeech, 2020 - openreview.net
In this paper, we propose the neural homomorphic vocoder (NHV), a source-filter model
based neural vocoder framework. NHV synthesizes speech by filtering impulse trains and …

On DCT-based MMSE estimation of short time spectral amplitude for single-channel speech enhancement

S Shi, K Paliwal, A Busch - Applied Acoustics, 2023 - Elsevier
This paper proposes Discrete Cosine Transform (DCT) based speech
enhancement algorithms. These algorithms utilize minimum mean square error (MMSE) …

Analytic phase features for dysarthric speech detection and intelligibility assessment

K Gurugubelli, AK Vuppala - Speech Communication, 2020 - Elsevier
The objectives of the dysarthria assessment are to discriminate dysarthric speech from
normal speech, to estimate the severity of dysarthria in terms of the dysarthric speech …

Acoustic application of phase reconstruction algorithms in optics

T Kobayashi, T Tanaka, K Yatabe… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Phase reconstruction from amplitude spectrograms has attracted attention in recent
acoustics because of its potential applications in speech synthesis and enhancement. The …

Impact of phase estimation on single-channel speech separation based on time-frequency masking

F Mayer, DS Williamson, P Mowlaee… - The Journal of the …, 2017 - pubs.aip.org
Time-frequency masking is a common solution for the single-channel source separation
(SCSS) problem where the goal is to find a time-frequency mask that separates the …

Funnel Deep Complex U-Net for Phase-Aware Speech Enhancement.

Y Sun, L Yang, H Zhu, J Hao - Interspeech, 2021 - isca-archive.org
The emergence of deep neural networks has made speech enhancement well developed.
Most of the early models focused on estimating the magnitude of spectrum while ignoring …