Wav2SQL: Direct Generalizable Speech-to-SQL Parsing

2 results (0.02 seconds)

[PDF] arxiv.org

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

H Liu, R Huang, X Lin, W Xu, M Zheng, H Chen… - arXiv preprint arXiv …, 2023 - arxiv.org

Text-to-speech (TTS) has undergone remarkable improvements in performance, particularly
with the advent of Denoising Diffusion Probabilistic Models (DDPMs). However, the …

Cited by 6 · Related articles · All 6 versions

[PDF] arxiv.org

TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation

X Cheng, R Huang, L Li, T Jin, Z Wang, A Yin… - arXiv preprint arXiv …, 2023 - arxiv.org

Direct speech-to-speech translation achieves high-quality results through the introduction of
discrete units obtained from self-supervised learning. This approach circumvents delays and …

Cited by 1 · Related articles · All 3 versions
