AudioLDM: Text-to-audio generation with latent diffusion models H Liu, Z Chen, Y Yuan, X Mei, X Liu, D Mandic, W Wang, MD Plumbley arXiv preprint arXiv:2301.12503, 2023 | 290 | 2023 |
AudioLDM 2: Learning holistic audio generation with self-supervised pretraining H Liu, Y Yuan, X Liu, X Mei, Q Kong, Q Tian, Y Wang, W Wang, Y Wang, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 73* | 2024 |
Separate anything you describe X Liu, Q Kong, Y Zhao, H Liu, Y Yuan, Y Liu, R Xia, Y Wang, MD Plumbley, ... arXiv preprint arXiv:2308.05037, 2023 | 14 | 2023 |
MLOps spanning whole machine learning life cycle: A survey F Zhengxin, Y Yi, Z Jingyu, L Yue, M Yuechen, L Qinghua, X Xiwei, W Jeff, ... arXiv preprint arXiv:2304.07296, 2023 | 10 | 2023 |
Latent diffusion model based Foley sound generation system for DCASE Challenge 2023 Task 7 Y Yuan, H Liu, X Liu, X Kang, MD Plumbley, W Wang arXiv preprint arXiv:2305.15905, 2023 | 9 | 2023 |
WavJourney: Compositional audio creation with large language models X Liu, Z Zhu, H Liu, Y Yuan, M Cui, Q Huang, J Liang, Y Cao, Q Kong, ... arXiv preprint arXiv:2307.14335, 2023 | 8 | 2023 |
Text-driven Foley sound generation with latent diffusion model Y Yuan, H Liu, X Liu, X Kang, P Wu, MD Plumbley, W Wang arXiv preprint arXiv:2306.10359, 2023 | 7 | 2023 |
Leveraging pre-trained AudioLDM for sound generation: A benchmark study Y Yuan, H Liu, J Liang, X Liu, MD Plumbley, W Wang 2023 31st European Signal Processing Conference (EUSIPCO), 765-769, 2023 | 6 | 2023 |
Retrieval-augmented text-to-audio generation Y Yuan, H Liu, X Liu, Q Huang, MD Plumbley, W Wang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 5 | 2024 |
PLDISET: Probabilistic localization and detection of independent sound events with transformers P Wu, J Zhao, Y Chen, B Davide, Y Yuan, C Zhu, Y Cao, Y Liu, ... DCASE Workshop, 2023 | 1 | 2023 |
SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound H Liu, X Xu, Y Yuan, M Wu, W Wang, MD Plumbley arXiv preprint arXiv:2405.00233, 2024 | | 2024 |
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining Y Yuan, Z Chen, X Liu, H Liu, X Xu, D Jia, Y Chen, MD Plumbley, W Wang arXiv preprint arXiv:2404.17806, 2024 | | 2024 |
HFM++: An Enhanced Holographic Factorization Machine for Recommendation Z Fang, M Qu, S Zhang, J Zhang, Y Yuan, L Yao, S Chen Australasian Conference on Data Mining, 72-85, 2021 | | 2021 |