MHPP: Exploring the Capabilities and Limitations of Language Models Beyond Basic Code Generation | J Dai, J Lu, Y Feng, R Ruan, M Cheng, H Tan, Z Guo | arXiv preprint arXiv:2405.11430, 2024 | 4 | 2024 |
MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models | Z Zeng, Y Liu, Y Wan, J Li, P Chen, J Dai, Y Yao, R Xu, Z Qi, W Zhao, ... | NeurIPS 2024, 2024 | 1 | 2024 |
AutoCV: Empowering Reasoning with Automated Process Labeling via Confidence Variation | J Lu, Z Dou, H Wang, Z Cao, J Dai, Y Wan, Y Huang, Z Guo | NeurIPS 2024, 2024 | 1 | 2024 |
SOAP: Enhancing Efficiency of Generated Code via Self-Optimization | D Huang, J Dai, H Weng, P Wu, Y Qing, JM Zhang, H Cui, Z Guo | NeurIPS 2024, 2024 | 1 | 2024 |