Audio-visual event localization in unconstrained videos Y Tian, J Shi, B Li, Z Duan, C Xu Proceedings of the European Conference on Computer Vision (ECCV), 247-263, 2018 | 449 | 2018 |
Hierarchical cross-modal talking face generation with dynamic pixel-wise loss L Chen, RK Maddox, Z Duan, C Xu Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 385 | 2019 |
Automatic Music Transcription: An Overview E Benetos, S Dixon, Z Duan, S Ewert IEEE Signal Processing Magazine 36 (1), 20-30, 2018 | 314 | 2018 |
Lip movements generation at a glance L Chen, Z Li, RK Maddox, Z Duan, C Xu Proceedings of the European Conference on Computer Vision (ECCV), 520-535, 2018 | 243 | 2018 |
Deep Cross-Modal Audio-Visual Generation L Chen, S Srivastava, Z Duan, C Xu Proceedings of the Thematic Workshops of ACM Multimedia 2017, 349-357, 2017 | 225 | 2017 |
Multiple fundamental frequency estimation by modeling spectral peaks and non-peak regions Z Duan, B Pardo, C Zhang IEEE Transactions on Audio, Speech, and Language Processing 18 (8), 2121-2133, 2010 | 218 | 2010 |
One-class learning towards synthetic voice spoofing detection Y Zhang, F Jiang, Z Duan IEEE Signal Processing Letters 28, 937-941, 2021 | 191 | 2021 |
Creating a Multitrack Classical Music Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications B Li, X Liu, K Dinesh, Z Duan, G Sharma IEEE Transactions on Multimedia 21 (2), 522-535, 2018 | 172 | 2018 |
Soundprism: An online system for score-informed source separation of music audio Z Duan, B Pardo IEEE Journal of Selected Topics in Signal Processing 5 (6), 1205-1215, 2011 | 129 | 2011 |
Unsupervised single-channel music source separation by average harmonic structure modeling Z Duan, Y Zhang, C Zhang, Z Shi IEEE Transactions on Audio, Speech, and Language Processing 16 (4), 766-778, 2008 | 122 | 2008 |
Bidirectional GRU for sound event detection R Lu, Z Duan Detection and Classification of Acoustic Scenes and Events, 1-3, 2017 | 81 | 2017 |
Unsupervised Learning Approach to Feature Analysis for Automatic Speech Emotion Recognition SE Eskimez, Z Duan, W Heinzelman 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 77 | 2018 |
Multi-pitch streaming of harmonic sound mixtures Z Duan, J Han, B Pardo IEEE/ACM Transactions on Audio, Speech, and Language Processing 22 (1), 138-150, 2014 | 68 | 2014 |
Speech driven talking face generation from a single image and an emotion condition SE Eskimez, Y Zhang, Z Duan IEEE Transactions on Multimedia 24, 3480-3490, 2021 | 67 | 2021 |
Online PLCA for Real-Time Semi-supervised Source Separation Z Duan, G Mysore, P Smaragdis International Conference on Latent Variable Analysis and Signal Separation …, 2012 | 67 | 2012 |
A state space model for online polyphonic audio-score alignment Z Duan, B Pardo IEEE International Conference on Acoustics, Speech and Signal Processing …, 2011 | 65 | 2011 |
Speech enhancement by online non-negative spectrogram decomposition in nonstationary noise environments Z Duan, GJ Mysore, P Smaragdis Thirteenth Annual Conference of the International Speech Communication …, 2012 | 64 | 2012 |
Siamese Style Convolutional Neural Networks for Sound Search by Vocal Imitation Y Zhang, B Pardo, Z Duan IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (2), 429-441, 2018 | 60 | 2018 |
Generating talking face landmarks from speech SE Eskimez, RK Maddox, C Xu, Z Duan Latent Variable Analysis and Signal Separation: 14th International …, 2018 | 60 | 2018 |