ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis J Xue, Y Deng, Y Han, Y Li, J Sun, J Liang 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | 14 | 2022 |
M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis J Xue, Y Deng, F Wang, Y Li, Y Gao, J Tao, J Sun, J Liang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 13 | 2023 |
Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation J Xue, Y Deng, Y Gao, Y Li arXiv preprint arXiv:2401.01044, 2024 | 5 | 2024 |
CMCU-CSS: Enhancing Naturalness via Commonsense-based Multi-modal Context Understanding in Conversational Speech Synthesis Y Deng, J Xue, F Wang, Y Gao, Y Li Proceedings of the 31st ACM International Conference on Multimedia, 6081-6089, 2023 | 4 | 2023 |
CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis Y Deng, J Xue, Y Jia, Q Li, Y Han, F Wang, Y Gao, D Ke, Y Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 2 | 2024 |
A Dynamic 3D Pronunciation Teaching Model Based on Pronunciation Attributes and Anatomy. X Feng, Y Xie, Y Deng, B Li INTERSPEECH, 1023-1024, 2020 | 2 | 2020 |
Frame-level Emotional State Alignment Method for Speech Emotion Recognition Q Li, Y Gao, C Wang, Y Deng, J Xue, Y Han, Y Li ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 1 | 2024 |
Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining J Xue, Y Deng, Y Gao, Y Li arXiv preprint arXiv:2406.03714, 2024 | | 2024 |
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model J Xue, Y Deng, Y Han, Y Gao, Y Li arXiv preprint arXiv:2406.03706, 2024 | | 2024 |
FMPAF: How Do Fed Chairs Affect the Financial Market? A Fine-grained Monetary Policy Analysis Framework on Their Language Y Deng, M Xu, Y Tang arXiv preprint arXiv:2403.06115, 2024 | | 2024 |
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis Y Deng, D Ke, Y Jia, J Xue, Q Luo, Y Li, J Sun, J Liang, B Lin 2022 13th International Symposium on Chinese Spoken Language Processing …, 2022 | | 2022 |