Y Wu, X Tang, TM Mitchell, Y Li - arXiv e-prints, 2023 - ui.adsabs.harvard.edu
Recent large language models (LLMs) have demonstrated great potential toward intelligent
agents and next-gen automation, but there is currently no systematic benchmark for …