Deep multimodal clustering for unsupervised audiovisual learning | D Hu, F Nie, X Li | Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019 | 248 | 2019 |
Multiple Sound Sources Localization from Coarse to Fine | R Qian, D Hu, H Dinkel, M Wu, N Xu, W Lin | arXiv preprint arXiv:2007.06355, 2020 | 161 | 2020 |
Deep binary reconstruction for cross-modal hashing | X Li, D Hu, F Nie | Proceedings of the 25th ACM international conference on Multimedia, 1398-1406, 2017 | 159 | 2017 |
Balanced multimodal learning via on-the-fly gradient modulation | X Peng, Y Wei, A Deng, D Wang, D Hu | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 151 | 2022 |
Discriminative sounding objects localization via self-supervised audiovisual matching | D Hu, R Qian, M Jiang, X Tan, S Wen, E Ding, W Lin, D Dou | Advances in Neural Information Processing Systems 33, 10077-10087, 2020 | 142 | 2020 |
Temporal multimodal learning in audiovisual speech recognition | D Hu, X Li | Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2016 | 122 | 2016 |
Unsupervised multi-source domain adaptation for person re-identification | Z Bai, Z Wang, J Wang, D Hu, E Ding | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021 | 103 | 2021 |
Learning to answer questions in dynamic audio-visual scenarios | G Li, Y Wei, Y Tian, C Xu, JR Wen, D Hu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 94 | 2022 |
Cyclic co-learning of sounding object visual grounding and sound separation | Y Tian, D Hu, C Xu | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021 | 91 | 2021 |
Large graph hashing with spectral rotation | X Li, D Hu, F Nie | Proceedings of the AAAI Conference on Artificial Intelligence 31 (1), 2017 | 66 | 2017 |
Temporal relational modeling with self-supervision for action segmentation | D Wang, D Hu, X Li, D Dou | Proceedings of the AAAI conference on artificial intelligence 35 (4), 2729-2737, 2021 | 53 | 2021 |
Learning in audio-visual context: A review, analysis, and new perspective | Y Wei, D Hu, Y Tian, X Li | arXiv preprint arXiv:2208.09579, 2022 | 52 | 2022 |
Dense multimodal fusion for hierarchically joint representation | D Hu, C Wang, F Nie, X Li | ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 47 | 2019 |
Curriculum audiovisual learning | D Hu, Z Wang, H Xiong, D Wang, F Nie, D Dou | arXiv preprint arXiv:2001.09414, 2020 | 44 | 2020 |
Self-supervised audiovisual representation learning for remote sensing data | K Heidler, L Mou, D Hu, P Jin, G Li, C Gan, JR Wen, XX Zhu | International Journal of Applied Earth Observation and Geoinformation 116 …, 2023 | 40 | 2023 |
Class-aware sounding objects localization via audiovisual correspondence | D Hu, Y Wei, R Qian, W Lin, R Song, JR Wen | IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (12), 9844 …, 2021 | 38 | 2021 |
Discrete spectral hashing for efficient similarity retrieval | D Hu, F Nie, X Li | IEEE Transactions on Image Processing 28 (3), 1080-1091, 2018 | 37 | 2018 |
Cross-Task Transfer for Geotagged Audiovisual Aerial Scene Recognition | D Hu, X Li, L Mou, P Jin, D Chen, L Jing, X Zhu, D Dou | arXiv preprint arXiv:2005.08449, 2020 | 34* | 2020 |
Ambient sound helps: Audiovisual crowd counting in extreme conditions | D Hu, L Mou, Q Wang, J Gao, Y Hua, D Dou, XX Zhu | arXiv preprint arXiv:2005.07097, 2020 | 33 | 2020 |
Visual sound localization in the wild by cross-modal interference erasing | X Liu, R Qian, H Zhou, D Hu, W Lin, Z Liu, B Zhou, X Zhou | Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 1801-1809, 2022 | 26 | 2022 |