Haohan Guo Academic Profile - Academic Resource Search

Citations

	All	Since 2019
Citations	264	264
h-index	9	9
i10-index	9	9

Year	2019	2020	2021	2022	2023	2024
Citations	8	22	40	45	75	70

Co-authors

Lei Xie - Northwestern Polytechnical University - Verified email at nwpu.edu.cn
Xixin Wu - The Chinese University of Hong Kong - Verified email at se.cuhk.edu.hk
Lei He - Principal Scientist Manager, Microsoft - Verified email at microsoft.com
Feng-Long Xie - Xiaohongshu - Verified email at xiaohongshu.com
Shaofei Zhang - Senior Software Engineer, Microsoft - Verified email at microsoft.com
Jiawen Kang - The Chinese University of Hong Kong - Verified email at se.cuhk.edu.hk
Shan Yang - Tencent AI Lab - Verified email at nwpu-aslp.org
Dan Su - Tencent AI Lab - Verified email at tencent.com
Chunlei Zhang - SEED, Bytedance; Ex-Tencent AI Lab - Verified email at bytedance.com
Dong Yu (俞栋) - Distinguished Scientist @ Tencent AI Lab, ACM/IEEE/ISCA Fellow - Verified email at global.tencent.com
Yujia Xiao - The Chinese University of Hong Kong - Verified email at link.cuhk.edu.hk


Haohan Guo

Chinese University of Hong Kong

Verified email at se.cuhk.edu.hk

Speech Synthesis, Voice Conversion, Speech Processing


Title	Cited by	Year
Conversational end-to-end TTS for voice agents - H Guo, S Zhang, FK Soong, L He, L Xie - 2021 IEEE Spoken Language Technology Workshop (SLT), 403-409, 2021	64	2021
A new GAN-based end-to-end TTS training algorithm - H Guo, FK Soong, L He, L Xie - INTERSPEECH, 2019	60	2019
Exploiting syntactic features in a parsed tree to improve end-to-end TTS - H Guo, FK Soong, L He, L Xie - INTERSPEECH, 2019	38	2019
BASE TTS: Lessons from building a billion-parameter text-to-speech model on 100K hours of data - M Łajszczak, G Cámbara, Y Li, F Beyhan, A van Korlaar, F Yang, A Joly, ... - arXiv preprint arXiv:2402.08093, 2024	22	2024
Improving adversarial waveform generation based singing voice conversion with harmonic signals - H Guo, Z Zhou, F Meng, K Liu - ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022	15	2022
Feature reinforcement with word embedding and parsing information in neural TTS - H Ming, L He, H Guo, FK Soong - arXiv preprint arXiv:1901.00707, 2019	14	2019
Phonetic posteriorgrams based many-to-many singing voice conversion via adversarial training - H Guo, H Lu, N Hu, C Zhang, S Yang, L Xie, D Su, D Yu - arXiv preprint arXiv:2012.01837, 2020	11	2020
MSMC-TTS: Multi-stage multi-codebook VQ-VAE based neural TTS - H Guo, F Xie, X Wu, FK Soong, H Meng - IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1811-1824, 2023	10	2023
A multi-stage multi-codebook VQ-VAE approach to high-performance neural TTS - H Guo, F Xie, FK Soong, X Wu, H Meng - arXiv preprint arXiv:2209.10887, 2022	10	2022
A multi-scale time-frequency spectrogram discriminator for GAN-based non-autoregressive TTS - H Guo, H Lu, X Wu, H Meng - arXiv preprint arXiv:2203.01080, 2022	7	2022
Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations - H Guo, F Xie, X Wu, H Lu, H Meng - arXiv preprint arXiv:2210.15131, 2022	4	2022
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning - H Guo, F Xie, J Kang, Y Xiao, X Wu, H Meng - IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024	3	2024
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition - J Kang, L Meng, M Cui, H Guo, X Wu, X Liu, H Meng - ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024	2	2024
UniAudio: Towards Universal Audio Generation with Large Language Models - D Yang, J Tian, X Tan, R Huang, S Liu, H Guo, X Chang, J Shi, J Bian, ... - Forty-first International Conference on Machine Learning, 2024	2	2024
Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation - H Li, L Xue, H Guo, X Zhu, Y Lv, L Xie, Y Chen, H Yin, Z Li - arXiv preprint arXiv:2406.07422, 2024	1	2024
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models - D Yang, D Wang, H Guo, X Chen, X Wu, H Meng - arXiv preprint arXiv:2406.02328, 2024	1	2024
SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models - D Yang, R Huang, Y Wang, H Guo, D Chong, S Liu, X Wu, H Meng - arXiv preprint arXiv:2408.13893, 2024		2024
UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner - D Yang, H Guo, Y Wang, R Huang, X Li, X Tan, X Wu, H Meng - arXiv preprint arXiv:2406.10056, 2024		2024
Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder - H Guo, F Xie, D Yang, H Lu, X Wu, H Meng - arXiv preprint arXiv:2406.02940, 2024		2024
Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations - H Lu, X Wu, H Guo, S Liu, Z Wu, H Meng - ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024		2024
