D Zheng, Y Wang, E Shi, R Zhang, Y Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
To evaluate the code generation capabilities of Large Language Models (LLMs) in complex
real-world software development scenarios, many evaluation approaches have been …