Title | Authors | Venue | Cited by | Year |
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection | B Lin, B Zhu, Y Ye, M Ning, P Jin, L Yuan | arXiv preprint arXiv:2311.10122, 2023 | 109 | 2023 |
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations | P Jin, J Huang, F Liu, X Wu, S Ge, G Song, D Clifton, J Chen | NeurIPS 2022 Spotlight 35, 30291-30306, 2022 | 41 | 2022 |
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning | P Jin, J Huang, P Xiong, S Tian, C Liu, X Ji, L Yuan, J Chen | CVPR 2023 Highlight, 2472-2482, 2023 | 40 | 2023 |
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | B Lin, Z Tang, Y Ye, J Cui, B Zhu, P Jin, J Zhang, M Ning, L Yuan | arXiv preprint arXiv:2401.15947, 2024 | 33 | 2024 |
DiffusionRet: Generative Text-Video Retrieval with Diffusion Model | P Jin, H Li, Z Cheng, K Li, X Ji, C Liu, L Yuan, J Chen | ICCV 2023, 2470-2481, 2023 | 33 | 2023 |
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | P Jin, R Takanobu, W Zhang, X Cao, L Yuan | CVPR 2024 Highlight, 13700-13710, 2024 | 30 | 2024 |
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao, S Geng, Z Lin, P Jin, ... | ICML 2024, 2024 | 29* | 2024 |
Weakly-Supervised 3D Spatial Reasoning for Text-based Visual Question Answering | H Li, J Huang, P Jin, G Song, Q Wu, J Chen | IEEE Transactions on Image Processing, 2023 | 28* | 2023 |
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment | P Jin, H Li, Z Cheng, J Huang, Z Wang, L Yuan, C Liu, J Chen | IJCAI 2023, 938-946, 2023 | 18 | 2023 |
Parallel Vertex Diffusion for Unified Visual Grounding | Z Cheng, K Li, P Jin, X Ji, L Yuan, C Liu, J Chen | AAAI 2024, 1326-1334, 2024 | 13 | 2024 |
TG-VQA: Ternary Game of Video Question Answering | H Li, P Jin, Z Cheng, S Zhang, K Chen, Z Wang, C Liu, J Chen | IJCAI 2023, 1044-1052, 2023 | 11 | 2023 |
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs | P Jin, Y Wu, Y Fan, Z Sun, W Yang, L Yuan | NeurIPS 2023, 2023 | 11 | 2023 |
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting | J Zhang, Z Tang, Y Pang, X Cheng, P Jin, Y Wei, W Yu, M Ning, L Yuan | arXiv preprint arXiv:2312.13271, 2023 | 10 | 2023 |
Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation | K Li, Y Zhao, Z Wang, Z Cheng, P Jin, X Ji, L Yuan, C Liu, J Chen | ICCV 2023, 666-676, 2023 | 6 | 2023 |
LLMBind: A Unified Modality-Task Integration Framework | B Zhu, P Jin, M Ning, B Lin, J Huang, Q Song, M Pan, L Yuan | arXiv preprint arXiv:2402.14891, 2024 | 3 | 2024 |
FreestyleRet: Retrieving Images from Style-Diversified Queries | H Li, C Jia, P Jin, Z Cheng, K Li, J Sui, C Liu, L Yuan | arXiv preprint arXiv:2312.02428, 2023 | 3 | 2023 |
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation | Z Cheng, K Li, H Li, P Jin, C Liu, X Zheng, R Ji, J Chen | arXiv preprint arXiv:2401.09732, 2024 | 2 | 2024 |
WiCo: Win-win Cooperation of Bottom-up and Top-down Referring Image Segmentation | Z Cheng, P Jin, H Li, K Li, S Li, X Ji, C Liu, J Chen | IJCAI 2023, 636-644, 2023 | 2 | 2023 |
RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter | M Cao, H Tang, J Huang, P Jin, C Zhang, R Liu, L Chen, X Liang, L Yuan, ... | ACL 2024 Findings, 2024 | | 2024 |