If LLM is the wizard, then code is the wand: A survey on how code empowers large language models to serve as intelligent agents

K Yang, J Liu, J Wu, C Yang, YR Fung, S Li… - arXiv preprint arXiv …, 2024 - arxiv.org
The prominent large language models (LLMs) of today differ from past language models not
only in size, but also in the fact that they are trained on a combination of natural language …

Executable code actions elicit better LLM agents

X Wang, Y Chen, L Yuan, Y Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Model (LLM) agents, capable of performing a broad range of actions, such
as invoking tools and controlling robots, show great potential in tackling real-world …

SmartPlay: A benchmark for LLMs as intelligent agents

Y Wu, X Tang, TM Mitchell, Y Li - arXiv preprint arXiv:2310.01557, 2023 - arxiv.org
Recent large language models (LLMs) have demonstrated great potential toward intelligent
agents and next-gen automation, but there currently lacks a systematic benchmark for …

ModelScope-Agent: Building your customizable agent system with open-source large language models

C Li, H Chen, M Yan, W Shen, H Xu, Z Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have recently demonstrated remarkable capabilities to
comprehend human intentions, engage in reasoning, and design planning-like behavior. To …

TPTU: Task planning and tool usage of large language model-based AI agents

J Ruan, Y Chen, B Zhang, Z Xu, T Bao, G Du… - arXiv preprint arXiv …, 2023 - arxiv.org
With recent advancements in natural language processing, Large Language Models (LLMs)
have emerged as powerful tools for various real-world applications. Despite their prowess …

Middleware for LLMs: Tools are instrumental for language agents in complex environments

Y Gu, Y Shu, H Yu, X Liu, Y Dong, J Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
The applications of large language models (LLMs) have expanded well beyond the confines
of text processing, signaling a new era where LLMs are envisioned as generalist language …

L2CEval: Evaluating language-to-code generation capabilities of large language models

A Ni, P Yin, Y Zhao, M Riddell, T Feng, R Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, large language models (LLMs), especially those that are pretrained on code, have
demonstrated strong capabilities in generating programs from natural language inputs in a …

Lumos: Learning agents with unified data, modular design, and open-source LLMs

D Yin, F Brahman, A Ravichander, K Chandu… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Lumos, a novel framework for training language agents that employs a unified
data format and a modular architecture based on open-source large language models …

The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - arXiv preprint arXiv …, 2023 - arxiv.org
For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing
the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are …

Benchmarking large language models as AI research agents

Q Huang, J Vora, P Liang, J Leskovec - arXiv preprint arXiv:2310.03302, 2023 - arxiv.org
Scientific experimentation involves an iterative process of creating hypotheses, designing
experiments, running experiments, and analyzing the results. Can we build AI research …