MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI X Yue*, Y Ni*, K Zhang*, T Zheng*, R Liu, G Zhang, S Stevens, D Jiang, ... CVPR Oral 2024, 2023 | 307 | 2023 |
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement T Zheng, G Zhang, T Shen, X Liu, BY Lin, J Fu, W Chen, X Yue ACL Findings 2024, 2024 | 58 | 2024 |
MAmmoTH2: Scaling Instructions from the Web X Yue*, T Zheng*, G Zhang*, W Chen* NeurIPS 2024, 2024 | 30 | 2024 |
ChatMusician: Understanding and Generating Music Intrinsically with LLM R Yuan, H Lin, Y Wang, Z Tian, S Wu, T Shen, G Zhang, Y Wu, C Liu, ... ACL Findings 2024, 2024 | 26 | 2024 |
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series G Zhang, S Qu, J Liu, C Zhang, C Lin, CL Yu, D Pan, E Cheng, J Liu, ... arXiv preprint arXiv:2405.19327, 2024 | 15 | 2024 |
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning Y Bai, X Du, Y Liang, Y Jin, Z Liu, J Zhou, T Zheng, X Zhang, N Ma, ... arXiv preprint arXiv:2403.18058, 2024 | 11* | 2024 |
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark G Zhang, X Du, B Chen, Y Liang, T Luo, T Zheng, K Zhu, Y Cheng, C Xu, ... arXiv preprint arXiv:2401.11944, 2024 | 11* | 2024 |
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark X Yue*, T Zheng*, Y Ni*, Y Wang, K Zhang, S Tong, Y Sun, M Yin, B Yu, ... arXiv preprint arXiv:2409.02813, 2024 | 8 | 2024 |
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model X Du, Z Yu, S Gao, D Pan, Y Cheng, Z Ma, R Yuan, X Qu, J Liu, T Zheng, ... COLM 2024, 2024 | 8 | 2024 |
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding A Zhuang, G Zhang, T Zheng, X Du, J Wang, W Ren, SW Huang, J Fu, ... COLM 2024, 2024 | 6* | 2024 |
MuPT: A Generative Symbolic Music Pretrained Transformer X Qu, Y Bai, Y Ma, Z Zhou, KM Lo, J Liu, R Yuan, L Min, X Liu, T Zhang, ... arXiv preprint arXiv:2404.06393, 2024 | 4 | 2024 |
Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation T Zheng, S Guo, X Qu, J Guo, X Du, Q Jia, C Lin, W Huang, J Fu, G Zhang arXiv preprint arXiv:2401.06477, 2024 | 4* | 2024 |
GIEBench: Towards Holistic Evaluation of Group Identity-based Empathy for Large Language Models L Wang, Y Jin, T Shen, T Zheng, X Du, C Zhang, W Huang, J Liu, S Wang, ... arXiv preprint arXiv:2406.14903, 2024 | 3* | 2024 |
CodeEditorBench: Evaluating Code Editing Capability of Large Language Models J Guo, Z Li, X Liu, K Ma, T Zheng, Z Yu, D Pan, Y Li, R Liu, Y Wang, S Guo, ... arXiv preprint arXiv:2404.03543, 2024 | 3 | 2024 |
LIME-M: Less is More for Evaluation of MLLMs K Zhu, Q Zang, S Jia, S Wu, F Fang, Y Li, S Guo, T Zheng, B Li, H Wu, ... arXiv preprint arXiv:2409.06851, 2024 | 2 | 2024 |
Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction Y Jin, G Zhang, H Zhao, T Zheng, J Guo, L Xiang, S Yue, SW Huang, Z He, ... arXiv preprint arXiv:2402.04154, 2024 | 2 | 2024 |
Dynamic Generation of Personalities with Large Language Models J Liu, H Gu, T Zheng, L Xiang, H Wu, J Fu, Z He arXiv preprint arXiv:2404.07084, 2024 | 1 | 2024 |
DEEP-ICL: Definition-Enriched Experts for Language Model In-Context Learning X Qu, Y Liang, Y Wang, T Zheng, T Yue, L Ma, SW Huang, J Zhang, ... arXiv preprint arXiv:2403.04233, 2024 | 1 | 2024 |
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions Z Li, Q Zang, D Ma, J Guo, T Zheng, X Niu, X Yue, Y Wang, J Yang, J Liu, ... arXiv preprint arXiv:2410.20424, 2024 | | 2024 |
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model S Wu*, Z Peng*, X Du*, T Zheng*, M Liu, J Wu, J Ma, Y Li, J Yang, W Zhou, ... arXiv preprint arXiv:2410.13639, 2024 | | 2024 |