Ziyue Jiang
Verified email at zju.edu.cn
Title
Cited by
Year
Geneface: Generalized and high-fidelity audio-driven 3d talking face synthesis
Z Ye, Z Jiang, Y Ren, J Liu, J He, Z Zhao
ICLR 2023, 2023
Cited by 78, 2023
Mega-tts: Zero-shot text-to-speech at scale with intrinsic inductive bias
Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang, S Ji, R Huang, C Wang, ...
arXiv preprint arXiv:2306.03509, 2023
Cited by 40, 2023
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Z Jiang, J Liu, Y Ren, J He, Z Ye, S Ji, Q Yang, C Zhang, P Wei, C Wang, ...
ICLR 2024, 2024
Cited by 23*, 2024
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners
R Huang, C Zhang, Y Wang, D Yang, J Tian, Z Ye, L Liu, Z Wang, Z Jiang, ...
Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024
Cited by 22*, 2024
Self-Supervised Spoofing Audio Detection Scheme.
J Ziyue, Z Hongcheng, P Li, D Wenbing, R Yanzhen
InterSpeech 2020, 4223-4227, 2020
Cited by 21, 2020
Textrolspeech: A text style control speech corpus with codec language text-to-speech models
S Ji, J Zuo, M Fang, Z Jiang, F Chen, X Duan, B Huai, Z Zhao
ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by 18, 2024
FedSpeech: Federated Text-to-Speech with Continual Learning
Z Jiang, Y Ren, M Lei, Z Zhao
IJCAI 2021, 3829-3835, 2021
Cited by 18, 2021
Geneface++: Generalized and stable real-time audio-driven 3d talking face generation
Z Ye, J He, Z Jiang, R Huang, J Huang, J Liu, Y Ren, X Yin, Z Ma, Z Zhao
arXiv preprint arXiv:2305.00787, 2023
Cited by 16, 2023
Clapspeech: Learning prosody from text context with contrastive language-audio pre-training
Z Ye, R Huang, Y Ren, Z Jiang, J Liu, J He, X Yin, Z Zhao
ACL 2023, 2023
Cited by 14, 2023
Real3d-portrait: One-shot realistic 3d talking portrait synthesis
Z Ye, T Zhong, Y Ren, J Yang, W Li, J Huang, Z Jiang, J He, R Huang, ...
ICLR 2024, 2024
Cited by 12, 2024
FastDiff 2: Revisiting and incorporating GANs and diffusion models in high-fidelity speech synthesis
R Huang, Y Ren, Z Jiang, C Cui, J Liu, Z Zhao
Findings of the Association for Computational Linguistics: ACL 2023, 6994-7009, 2023
Cited by 6, 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
Z Jiang, Q Yang, J Zuo, Z Ye, R Huang, Y Ren, Z Zhao
ACL 2023, 2023
Cited by 6, 2023
Language-codec: Reducing the gaps between discrete codec representation and speech language models
S Ji, M Fang, Z Jiang, R Huang, J Zuo, S Wang, Z Zhao
arXiv preprint arXiv:2402.12208, 2024
Cited by 4, 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
S Ji, Z Jiang, H Wang, J Zuo, Z Zhao
ACL 2024, 2024
Cited by 4, 2024
Ada-TTA: Towards adaptive high-quality text-to-talking avatar synthesis
Z Ye, Z Jiang, Y Ren, J Liu, C Zhang, X Yin, Z Ma, Z Zhao
arXiv preprint arXiv:2306.03504, 2023
Cited by 4, 2023
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Z Jiang, Z Su, Z Zhao, Q Yang, Y Ren, J Liu, Z Ye
NeurIPS 2022, 2022
Cited by 4, 2022
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
Q Yang, J Xu, W Liu, Y Chu, Z Jiang, X Zhou, Y Leng, Y Lv, Z Zhao, ...
ACL 2024, 2024
Cited by 3, 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
S Ji, J Zuo, M Fang, S Zheng, Q Chen, W Wang, Z Jiang, H Huang, ...
arXiv preprint arXiv:2406.01205, 2024
Cited by 2, 2024
MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
J Huang, C Zhang, Y Ren, Z Jiang, Z Ye, J Liu, J He, X Yin, Z Zhao
arXiv preprint arXiv:2408.04708, 2024
2024
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
Q Yang, J Zuo, Z Su, Z Jiang, M Li, Z Zhao, F Chen, Z Wang, B Huai
arXiv preprint arXiv:2407.14006, 2024
2024