PromptTTS: Controllable Text-to-Speech with Text Descriptions | Z Guo, Y Leng, Y Wu, S Zhao, X Tan | ICASSP 2023, 2022 | 73 | 2022 |
AdaSpeech 4: Adaptive text to speech in zero-shot scenarios | Y Wu, X Tan, B Li, L He, S Zhao, R Song, T Qin, TY Liu | InterSpeech 2022, 2022 | 58 | 2022 |
ResGrad: Residual denoising diffusion probabilistic models for text to speech | Z Chen, Y Wu, Y Leng, J Chen, H Liu, X Tan, Y Cui, K Wang, L He, S Zhao, ... | arXiv preprint arXiv:2212.14518, 2022 | 16 | 2022 |
Self-supervised context-aware style representation for expressive speech synthesis | Y Wu, X Wang, S Zhang, L He, R Song, JY Nie | InterSpeech 2022, 2022 | 16 | 2022 |
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing | Y Wu, J Guo, X Tan, C Zhang, B Li, R Song, L He, S Zhao, A Menezes, ... | AAAI 2023, 2022 | 11 | 2022 |
YuLan: An Open-source Large Language Model | Y Zhu, K Zhou, K Mao, W Chen, Y Sun, Z Chen, Q Cao, Y Wu, Y Chen, ... | arXiv preprint arXiv:2406.19853, 2024 | 1 | 2024 |
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units | X Chang, J Shi, J Tian, Y Wu, Y Tang, Y Wu, S Watanabe, Y Adi, X Chen, ... | arXiv preprint arXiv:2406.07725, 2024 | 1 | 2024 |
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition | Y Wu, S Maiti, Y Peng, W Zhang, C Li, Y Wang, X Wang, S Watanabe, ... | arXiv preprint arXiv:2401.18045, 2024 | 1 | 2024 |
Understanding Human Preferences: Towards More Personalized Video to Text Generation | Y Wu, R Song, X Chen, H Jiang, Z Cao, J Yu | Proceedings of the ACM on Web Conference 2024, 3952-3963, 2024 | | 2024 |
TiVA: Time-Aligned Video-to-Audio Generation | X Wang, Y Wang, Y Wu, R Song, X Tan, Z Chen, H Xu, G Sui | ACM Multimedia 2024, 2024 | | 2024 |
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios | Y Wang, H Xiao, Y Wu, R Song | InterSpeech 2023, 2023 | | 2023 |