Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network | W Wang, E Xie, X Song, Y Zang, W Wang, T Lu, G Yu, C Shen | IEEE International Conference on Computer Vision (ICCV), 2019 | 538 | 2019 |
Seesaw Loss for Long-Tailed Instance Segmentation | J Wang, W Zhang, Y Zang, Y Cao, J Pang, T Gong, K Chen, Z Liu, CC Loy, ... | IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021 | 256 | 2021 |
Scene Text Detection with Supervised Pyramid Context Network | E Xie, Y Zang, S Shao, G Yu, C Yao, G Li | AAAI Conference on Artificial Intelligence (AAAI), 2019 | 249 | 2019 |
Open-Vocabulary DETR with Conditional Matching | Y Zang, W Li, K Zhou, C Huang, CC Loy | European Conference on Computer Vision (ECCV), 2022 | 155 | 2022 |
Unified Vision and Language Prompt Learning | Y Zang, W Li, K Zhou, C Huang, CC Loy | arXiv preprint arXiv:2210.07225, 2022 | 121 | 2022 |
FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation | Y Zang, C Huang, CC Loy | IEEE International Conference on Computer Vision (ICCV), 2021 | 108 | 2021 |
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model | X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, X Wei, S Zhang, ... | arXiv preprint arXiv:2401.16420, 2024 | 84 | 2024 |
InternLM2 Technical Report | Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... | arXiv preprint arXiv:2403.17297, 2024 | 61 | 2024 |
Contextual Object Detection with Multimodal Large Language Models | Y Zang, W Li, J Han, K Zhou, CC Loy | arXiv preprint arXiv:2305.18279, 2023 | 37 | 2023 |
InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD | X Dong, P Zhang, Y Zang, Y Cao, B Wang, L Ouyang, S Zhang, H Duan, ... | arXiv preprint arXiv:2404.06512, 2024 | 36 | 2024 |
Are We on the Right Way for Evaluating Large Vision-Language Models? | L Chen, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, J Wang, Y Qiao, ... | arXiv preprint arXiv:2403.20330, 2024 | 36 | 2024 |
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | Z Sun, Y Fang, T Wu, P Zhang, Y Zang, S Kong, Y Xiong, D Lin, J Wang | IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024 | 24 | 2024 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP | B Zhang, P Zhang, X Dong, Y Zang, J Wang | arXiv preprint arXiv:2403.15378, 2024 | 22 | 2024 |
1st Place Solutions for OpenImage2019--Object Detection and Instance Segmentation | Y Liu, G Song, Y Zang, Y Gao, E Xie, J Yan, CC Loy, X Wang | arXiv preprint arXiv:2003.07557, 2020 | 21 | 2020 |
Semi-Supervised and Long-Tailed Object Detection with CascadeMatch | Y Zang, K Zhou, C Huang, CC Loy | International Journal of Computer Vision (IJCV), 2023 | 11 | 2023 |
KPNet: Towards Minimal Face Detector | G Song, Y Liu, Y Zang, X Wang, B Leng, Q Yuan | AAAI Conference on Artificial Intelligence (AAAI), 2020 | 10 | 2020 |
On-Device Domain Generalization | K Zhou, Y Zhang, Y Zang, J Yang, CC Loy, Z Liu | arXiv preprint arXiv:2209.07521, 2022 | 6 | 2022 |
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | L Chen, X Wei, J Li, X Dong, P Zhang, Y Zang, Z Chen, H Duan, B Lin, ... | arXiv preprint arXiv:2406.04325, 2024 | 5 | 2024 |
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Z Liu, Z Sun, Y Zang, W Li, P Zhang, X Dong, Y Xiong, D Lin, J Wang | arXiv preprint arXiv:2403.13805, 2024 | 5 | 2024 |
Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization | Y Zang, H Goh, J Susskind, C Huang | International Conference on Learning Representations (ICLR), 2024 | 4 | 2024 |