Autovocoder: Fast waveform generation from a learned speech representation using differentiable...

B Hayes, J Shier, G Fazekas, A McPherson… - Frontiers in Signal …, 2024 - frontiersin.org

The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …

Cited by 27 · Related articles · All 6 versions

[PDF] mtak.hu

Speech synthesis from intracranial stereotactic Electroencephalography using a neural vocoder

FV Arthur, TG Csapó - INFOCOMMUNICATIONS JOURNAL: A …, 2024 - real.mtak.hu

Speech is one of the most important human biosignals. However, only some speech
production characteristics are fully understood, which are required for a successful speech …

Cited by 5 · Related articles · All 3 versions

[PDF] mdpi.com

A Smart Control System for the Oil Industry Using Text-to-Speech Synthesis Based on IIoT

AR Mandeel, AA Aggar, MS Al-Radhi, TG Csapó - Electronics, 2023 - mdpi.com

Oil refineries have high operating expenses and are often exposed to increased asset
integrity risks and functional failure. Real-time monitoring of their operations has always …

Cited by 2 · Related articles · All 4 versions

[PDF] arxiv.org

Signal Reconstruction from Mel-Spectrogram Based on Bi-Level Consistency of Full-Band Magnitude and Phase

Y Masuyama, N Ueno, N Ono - 2023 IEEE Workshop on …, 2023 - ieeexplore.ieee.org

We propose an optimization-based method for reconstructing a time-domain signal from a
low-dimensional spectral representation such as a mel-spectrogram. Phase reconstruction …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation

HP Du, YX Lu, Y Ai, ZH Ling - arXiv preprint arXiv:2406.02162, 2024 - arxiv.org

This paper proposes a novel bidirectional neural vocoder, named BiVocoder, capable both
of feature extraction and reverse waveform generation within the short-time Fourier …

Cited by 1 · Related articles · All 2 versions

[PDF] arxiv.org

Puffin: Pitch-synchronous neural waveform generation for fullband speech on modest devices

O Watts, L Wihlborg… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

We present a neural vocoder designed with low-powered Alternative and Augmentative
Communication devices in mind. By combining elements of successful modern vocoders …

Cited by 3 · Related articles · All 4 versions

ChildTinyTalks (CTT): A Benchmark Dataset and Baseline for Expressive Child Speech Synthesis

S Alwaisi, MS Al-Radhi, G Németh - International Conference on Speech …, 2024 - Springer

Designing expressive speech synthesis for child voice remains an unresolved problem. One
of the major dilemmas faced by child TTS systems and child speech synthesis is the scarcity …

Automated Child Voice Generation: Methodology and Implementation

S Alwaisi, MS Al-Radhi… - … Conference on Speech …, 2023 - ieeexplore.ieee.org

Significant progress has been made in the development of text-to-speech (TTS) models;
however, synthesizing child speech remains a challenging task. Limited research has been …

Cited by 2 · Related articles

[PDF] arxiv.org

FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs

W Jang, D Lim, H Park - arXiv preprint arXiv:2305.10823, 2023 - arxiv.org

This paper presents FastFit, a novel neural vocoder architecture that replaces the U-Net
encoder with multiple short-time Fourier transforms (STFTs) to achieve faster generation …

Cited by 2 · Related articles · All 5 versions

[PDF] cyberrus.info

[PDF] Fast Synthesis of Audio Signals from Spectrogram Images in Speech Information Protection Tasks

СВ Дворянкин, НС Дворянкин… - Вопросы …, 2024 - cyberrus.info

Scientific novelty: a new spectrogram inversion method is proposed, based on dissecting
and spacing apart the image of the original spectrogram to obtain more accurate …
