Multimodal fusion of EHR in structures and semantics: Integrating clinical records and notes with hypergraph and LLM | H Cui, X Fang, R Xu, X Kan, JC Ho, C Yang | arXiv preprint arXiv:2403.08818, 2024 | 5 | 2024 |
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding | X Fang, K Mao, H Duan, X Zhao, Y Li, D Lin, K Chen | arXiv preprint arXiv:2406.14515, 2024 | 4 | 2024 |
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models | H Duan, J Yang, Y Qiao, X Fang, L Chen, Y Liu, X Dong, Y Zang, P Zhang, ... | arXiv preprint arXiv:2407.11691, 2024 | 1 | 2024 |
Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs | Y Qiao, H Duan, X Fang, J Yang, L Chen, S Zhang, J Wang, D Lin, ... | arXiv preprint arXiv:2406.14544, 2024 | 1 | 2024 |
Open visual knowledge extraction via relation-oriented multimodality model prompting | H Cui, X Fang, Z Zhang, R Xu, X Kan, X Liu, Y Yu, M Li, Y Song, C Yang | Advances in Neural Information Processing Systems 36, 2024 | 1 | 2024 |