AgentBench: Evaluating LLMs as Agents X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu, H Ding, K Men, K Yang, ... ICLR 2024, 2023 | 220* | 2023 |
SafetyBench: Evaluating the Safety of Large Language Models Z Zhang, L Lei, L Wu, R Sun, Y Huang, C Long, X Liu, X Lei, J Tang, ... Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 72* | 2024 |
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation P Ke, B Wen, A Feng, X Liu, X Lei, J Cheng, S Wang, A Zeng, Y Dong, ... Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 24* | 2024 |
AlignBench: Benchmarking Chinese Alignment of Large Language Models X Liu*, X Lei*, S Wang, Y Huang, Z Feng, B Wen, J Cheng, P Ke, Y Xu, ... ACL 2024, 2023 | 21 | 2023 |
AgentBench: Evaluating LLMs as Agents X Liu, H Yu, H Zhang, Y Xu, X Lei, H Lai, Y Gu, H Ding, K Men, K Yang, et al. arXiv preprint arXiv:2308.03688 | 19 | |
XDAI: A tuning-free framework for exploiting pre-trained language models in knowledge grounded dialogue generation J Yu, X Zhang, Y Xu, X Lei, X Guan, J Zhang, L Hou, J Li, J Tang Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022 | 13 | 2022 |
Scaffolding coordinates to promote vision-language coordination in large multi-modal models X Lei, Z Yang, X Chen, P Li, Y Liu arXiv preprint arXiv:2402.12058, 2024 | 7 | 2024 |
A Cause-Effect Look at Alleviating Hallucination of Knowledge-grounded Dialogue Generation J Yu, X Zhang, Y Xu, X Lei, Z Yao, J Zhang, L Hou, J Li COLING 2024, 2024 | | 2024 |