AudioLDM: Text-to-audio generation with latent diffusion models H Liu, Z Chen, Y Yuan, X Mei, X Liu, D Mandic, W Wang, MD Plumbley arXiv preprint arXiv:2301.12503, 2023 | 269 | 2023 |
WavCaps: A ChatGPT-assisted weakly-labelled audio captioning dataset for audio-language multimodal research X Mei, C Meng, H Liu, Q Kong, T Ko, C Zhao, MD Plumbley, Y Zou, ... arXiv preprint arXiv:2303.17395, 2023 | 100* | 2023 |
Audio captioning transformer X Mei, X Liu, Q Huang, MD Plumbley, W Wang arXiv preprint arXiv:2107.09817, 2021 | 68 | 2021 |
AudioLDM 2: Learning holistic audio generation with self-supervised pretraining H Liu, Y Yuan, X Liu, X Mei, Q Kong, Q Tian, Y Wang, W Wang, Y Wang, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 64* | 2024 |
On metric learning for audio-text cross-modal retrieval X Mei, X Liu, J Sun, MD Plumbley, W Wang arXiv preprint arXiv:2203.15537, 2022 | 48 | 2022 |
An encoder-decoder based audio captioning system with transfer and reinforcement learning X Mei, Q Huang, X Liu, G Chen, J Wu, Y Wu, J Zhao, S Li, T Ko, HL Tang, ... arXiv preprint arXiv:2108.02752, 2021 | 46 | 2021 |
Automated audio captioning: An overview of recent progress and new challenges X Mei, X Liu, MD Plumbley, W Wang EURASIP Journal on Audio, Speech, and Music Processing 2022 (1), 26, 2022 | 38 | 2022 |
Separate what you describe: Language-queried audio source separation X Liu, H Liu, Q Kong, X Mei, J Zhao, Q Huang, MD Plumbley, W Wang arXiv preprint arXiv:2203.15147, 2022 | 31 | 2022 |
Leveraging pre-trained BERT for audio captioning X Liu, X Mei, Q Huang, J Sun, J Zhao, H Liu, MD Plumbley, V Kilic, ... 2022 30th European Signal Processing Conference (EUSIPCO), 1145-1149, 2022 | 28 | 2022 |
Diverse audio captioning via adversarial training X Mei, X Liu, J Sun, MD Plumbley, W Wang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 25 | 2022 |
CL4AC: A contrastive loss for audio captioning X Liu, Q Huang, X Mei, T Ko, HL Tang, MD Plumbley, W Wang arXiv preprint arXiv:2107.09990, 2021 | 25 | 2021 |
Language-based audio retrieval with pre-trained models X Mei, X Liu, H Liu, J Sun, MD Plumbley, W Wang Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge …, 2022 | 20 | 2022 |
An encoder-decoder based audio captioning system with transfer and reinforcement learning for DCASE challenge 2021 task 6 X Mei, Q Huang, X Liu, G Chen, J Wu, Y Wu, J Zhao, S Li, T Ko, HL Tang, ... DCASE2021 Challenge, Tech. Rep., 2021 | 15 | 2021 |
Visually-aware audio captioning with adaptive audio-visual attention X Liu, Q Huang, X Mei, H Liu, Q Kong, J Sun, S Li, T Ko, Y Zhang, ... arXiv preprint arXiv:2210.16428, 2022 | 13 | 2022 |
Simple pooling front-ends for efficient audio classification X Liu, H Liu, Q Kong, X Mei, MD Plumbley, W Wang ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | 11 | 2023 |
Surrey system for DCASE 2022 task 5: Few-shot bioacoustic event detection with segment-level metric learning H Liu, X Liu, X Mei, Q Kong, W Wang, MD Plumbley arXiv preprint arXiv:2207.10547, 2022 | 11 | 2022 |
Deep neural decision forest for acoustic scene classification J Sun, X Liu, X Mei, J Zhao, MD Plumbley, V Kılıç, W Wang 2022 30th European Signal Processing Conference (EUSIPCO), 772-776, 2022 | 9 | 2022 |
Automated audio captioning with keywords guidance X Mei, X Liu, H Liu, J Sun, MD Plumbley, W Wang Proc. Detection and Classification of Acoustic Scenes and Events, 2022 | 9 | 2022 |
Segment-level metric learning for few-shot bioacoustic event detection H Liu, X Liu, X Mei, Q Kong, W Wang, MD Plumbley arXiv preprint arXiv:2207.07773, 2022 | 8 | 2022 |
First-shot anomalous sound detection with GMM clustering and finetuned attribute classification using audio pretrained model J Tian, H Zhang, Q Zhu, F Xiao, H Liu, X Mei, Y Liu, W Wang, J Guan DCASE2023 Challenge, Tech. Rep., 2023 | 5 | 2023 |