FILIP: fine-grained interactive language-image pre-training L Yao*, R Huang*, L Hou*, G Lu, M Niu, H Xu, X Liang, Z Li, X Jiang, C Xu arXiv preprint arXiv:2111.07783, 2021 | 478 | 2021 |
Wukong: 100 million large-scale Chinese cross-modal pre-training dataset and a foundation framework HX Jiaxi Gu, Xiaojun Meng, Guansong Lu, Lu Hou, Minzhe Niu, Xiaodan Liang ... arXiv preprint arXiv:2202.06767, 2022 | 83* | 2022 |
Deep Feature Fusion with Multiple Granularity for Vehicle Re-identification. P Huang, R Huang, J Huang, R Yangchen, Z He, X Li, J Chen CVPR workshops, 80-88, 2019 | 25 | 2019 |
NLIP: Noise-robust language-image pre-training R Huang, Y Long, J Han, H Xu, X Liang, C Xu, X Liang Proceedings of the AAAI Conference on Artificial Intelligence 37 (1), 926-934, 2023 | 22 | 2023 |
Fine-grained visual–text prompt-driven self-training for open-vocabulary object detection Y Long, J Han, R Huang, H Xu, Y Zhu, C Xu, X Liang IEEE Transactions on Neural Networks and Learning Systems, 2023 | 11 | 2023 |
Boosting visual-language models by exploiting hard samples H Wang, M Huang, R Huang, L Hong, H Xu, T Hu, X Liang, Z Li, H Cheng, ... arXiv preprint arXiv:2305.05208, 2023 | 5 | 2023 |
GrowCLIP: Data-aware automatic model growing for large-scale contrastive language-image pre-training X Deng, H Shi, R Huang, C Li, H Xu, J Han, J Kwok, S Zhao, W Zhang, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 2 | 2023 |
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models R Huang, X Ding, C Wang, J Han, Y Liu, H Zhao, H Xu, L Hou, W Zhang, ... arXiv preprint arXiv:2407.08706, 2024 | 1 | 2024 |
UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning X Dong, R Huang, X Wei, Z Jie, J Yu, J Yin, X Liang arXiv preprint arXiv:2306.00813, 2023 | 1 | 2023 |
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model R Huang, K Cai, J Han, X Liang, R Pei, G Lu, S Xu, W Zhang, H Xu arXiv preprint arXiv:2403.11929, 2024 | | 2024 |
System and method for cross-modal interaction based on pre-trained model H Xu, HOU Lu, LU Guansong, NIU Minzhe, Z Li, R Huang, YAO Lewei, ... US Patent App. 17/900,592, 2024 | | 2024 |
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability R Huang, J Han, G Lu, X Liang, Y Zeng, W Zhang, H Xu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | | 2023 |