SPTK4: An open-source software toolkit for speech signal processing

FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter

Y Ohtani, T Okamoto, T Toda… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

Some neural vocoders with fundamental frequency (f0) control have succeeded in
performing real-time inference on a single CPU while preserving the quality of the synthetic …

Cited by 4

Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model

J Cui, Y Gu, C Weng, J Zhang… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

This paper presents an advanced end-to-end singing voice synthesis (SVS) system based
on the source-filter mechanism that directly translates lyrical and melodic cues into …

Cited by 2

Performance of Text-Independent Automatic Speaker Recognition on a Multicore System

R Kouatly, TA Khan - Tsinghua Science and Technology, 2023 - ieeexplore.ieee.org

This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR)
algorithm based on a multicore system's Gaussian Mixture Model (GMM). The high speech …

HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters

L Juvela, PP Zarazaga, GE Henter, Z Malisz - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce an end-to-end neural speech synthesis system that uses the source-filter
model of speech production. Specifically, we apply differentiable resonant filters to a glottal …

Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis

CY Yu, G Fazekas - arXiv preprint arXiv:2406.05128, 2024 - arxiv.org

Training the linear prediction (LP) operator end-to-end for audio synthesis in modern deep
learning frameworks is slow due to its recursive formulation. In addition, frame-wise …

Cited by 1

GOLF: A Singing Voice Synthesiser with Glottal Flow Wavetables and LPC Filters

CY Yu, G Fazekas - … of the International Society for Music …, 2024 - transactions.ismir.net

This paper introduces GlOttal‑flow LPC Filter (GOLF), a novel method for singing voice
synthesis (SVS) that exploits the physical characteristics of the human voice using …

Teaching Speech Signal Processing Fundamentals in Undergraduate Class Using an Interactive GUI

R Rajan, ARA Mahadev, P Arjun… - 2024 32nd European …, 2024 - ieeexplore.ieee.org

This paper introduces an interactive GUI to teach speech signal processing fundamentals to
undergraduate students. Traditional teaching methods often struggle to convey complex …

Incorporating Cumulative Mean Normalized Difference Function Towards Interpretable Monophonic Singing Voice Pitch Extraction

X Li, C He - 2024 9th International Conference on Intelligent …, 2024 - ieeexplore.ieee.org

Pitch estimation plays an important role in various music processing and music information
retrieval applications. The traditional methods for pitch estimation contain rich prior …

Speech wave-form Driven Motion Synthesis For Embodied Agents

JH Lu - 2023 - core.ac.uk

The main objective of this thesis is to synthesise motion from speech, especially in
conversation. Based on previous research into different acoustic features or the combination …
