1. L. Sun, K. Li, H. Wang, S. Kang, H. Meng. "Phonetic posteriorgrams for many-to-one voice conversion without parallel data training." 2016 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2016. Cited by 360.
2. L. Sun, S. Kang, K. Li, H. Meng. "Voice conversion using deep bidirectional long short-term memory based recurrent neural networks." 2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015. Cited by 331.
3. Z. H. Ling, S. Y. Kang, H. Zen, A. Senior, M. Schuster, X. J. Qian, H. M. Meng, et al. "Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends." IEEE Signal Processing Magazine 32 (3), 35-52, 2015. Cited by 311*.
4. C. Yu, H. Lu, N. Hu, M. Yu, C. Weng, K. Xu, P. Liu, D. Tuo, S. Kang, G. Lei, D. Su, et al. "DurIAN: Duration Informed Attention Network for Speech Synthesis." Interspeech, 2027-2031, 2020. Cited by 198*.
5. S. Kang, X. Qian, H. Meng. "Multi-Distribution Deep Belief Network for Speech Synthesis." ICASSP 2013, 8012-8016, 2013. Cited by 140.
6. J. Yu, S. X. Zhang, J. Wu, S. Ghorbani, B. Wu, S. Kang, S. Liu, X. Liu, H. Meng, et al. "Audio-visual recognition of overlapped speech for the LRS2 dataset." ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020. Cited by 99.
7. P. Liu, Q. Yu, Z. Wu, S. Kang, H. Meng, L. Cai. "A deep recurrent approach for acoustic-to-articulatory inversion." 2015 IEEE International Conference on Acoustics, Speech and Signal …, 2015. Cited by 90.
8. H. Xu, K. Li, Y. Wang, J. Wang, S. Kang, X. Chen, D. Povey, S. Khudanpur. "Neural network language modeling with letter-based features and importance sampling." 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018. Cited by 86.
9. J. Chen, Z. Wang, D. Tuo, Z. Wu, S. Kang, H. Meng. "FullSubNet+: Channel attention FullSubNet with complex spectrograms for speech enhancement." ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022. Cited by 82.
10. L. Sun, H. Wang, S. Kang, K. Li, H. M. Meng. "Personalized, Cross-Lingual TTS Using Phonetic Posteriorgrams." Interspeech, 322-326, 2016. Cited by 75.
11. H. Lu, Z. Wu, D. Dai, R. Li, S. Kang, J. Jia, H. Meng. "One-shot voice conversion with global speaker embeddings." Interspeech, 669-673, 2019. Cited by 47.
12. S. Liu, D. Wang, Y. Cao, L. Sun, X. Wu, S. Kang, Z. Wu, X. Liu, D. Su, D. Yu, et al. "End-to-end accent conversion without using native utterances." ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and …, 2020. Cited by 40.
13. Q. Yu, P. Liu, Z. Wu, S. Kang, H. Meng, L. Cai. "Learning cross-lingual information with multilingual BLSTM for speech synthesis of low-resource languages." 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016. Cited by 35.
14. K. Li, X. Qian, S. Kang, H. Meng. "Lexical stress detection for L2 English speech using deep belief networks." Interspeech, 1811-1815, 2013. Cited by 34.
15. S. Yang, H. Lu, S. Kang, L. Xue, J. Xiao, D. Su, L. Xie, D. Yu. "On the localness modeling for the self-attention based end-to-end speech synthesis." Neural Networks 125, 121-130, 2020. Cited by 31.
16. J. Wang, J. Li, X. Zhao, Z. Wu, S. Kang, H. Meng. "Adversarially learning disentangled speech representations for robust multi-factor voice conversion." arXiv preprint arXiv:2102.00184, 2021. Cited by 26.
17. D. Dai, Z. Wu, S. Kang, X. Wu, J. Jia, D. Su, D. Yu, H. Meng. "Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-Trained BERT." Interspeech, 2090-2094, 2019. Cited by 25.
18. S. Liu, Y. Cao, S. Kang, N. Hu, X. Liu, D. Su, D. Yu, H. Meng. "Transferring source style in non-parallel voice conversion." arXiv preprint arXiv:2005.09178, 2020. Cited by 23.
19. X. Zhao, F. Liu, C. Song, Z. Wu, S. Kang, D. Tuo, H. Meng. "Disentangling content and fine-grained prosody information via hybrid ASR bottleneck features for voice conversion." ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and …, 2022. Cited by 21.
20. M. Wang, Z. Wu, S. Kang, X. Wu, J. Jia, D. Su, D. Yu, H. Meng. "Speech super-resolution using parallel WaveNet." 2018 11th International Symposium on Chinese Spoken Language Processing …, 2018. Cited by 21.