AutoAct: Automatic agent learning from scratch via self-planning

S Qiao, N Zhang, R Fang, Y Luo, W Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
Language agents have achieved considerable performance on various complex tasks.
Despite the incessant exploration in this field, existing language agent systems still struggle …

LLM-as-a-Judge & reward model: What they can and cannot do

G Son, H Ko, H Lee, Y Kim, S Hong - arXiv preprint arXiv:2409.11239, 2024 - arxiv.org
LLM-as-a-Judge and reward models are widely used alternatives of multiple-choice
questions or human annotators for large language model (LLM) evaluation. Their efficacy …

Trial and error: Exploration-based trajectory optimization for LLM agents

Y Song, D Yin, X Yue, J Huang, S Li, BY Lin - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have become integral components in various autonomous
agent systems. In this study, we present an exploration-based trajectory optimization …

A survey on self-evolution of large language models

Z Tao, TE Lin, X Chen, H Li, Y Wu, Y Li, Z Jin… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have significantly advanced in various fields and intelligent
agent applications. However, current LLMs that learn from human or external model …

DynaSaur: Large language agents beyond predefined actions

D Nguyen, VD Lai, S Yoon, RA Rossi, H Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing LLM agent systems typically select actions from a fixed and predefined set at every
step. While this approach is effective in closed, narrowly-scoped environments, we argue …

Nova: An iterative planning and search approach to enhance novelty and diversity of LLM generated ideas

X Hu, H Fu, J Wang, Y Wang, Z Li, R Xu, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Scientific innovation is pivotal for humanity, and harnessing large language models (LLMs)
to generate research ideas could transform discovery. However, existing LLMs often …

From text to life: On the reciprocal relationship between artificial life and large language models

E Nisioti, C Glanois, E Najarro, A Dai… - Artificial Life …, 2024 - direct.mit.edu
Large Language Models (LLMs) have taken the field of AI by storm, but their
adoption in the field of Artificial Life (ALife) has been, so far, relatively reserved. In this work …

Do multimodal foundation models understand enterprise workflows? A benchmark for business process management tasks

M Wornow, A Narayan, B Viggiano, IS Khare… - arXiv e …, 2024 - ui.adsabs.harvard.edu
Existing ML benchmarks lack the depth and diversity of annotations needed for evaluating
models on business process management (BPM) tasks. BPM is the practice of documenting …

On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models

M Verma, S Bhambri, S Kambhampati - arXiv preprint arXiv:2405.13966, 2024 - arxiv.org
The reasoning abilities of Large Language Models (LLMs) remain a topic of debate. Some
methods, such as ReAct-based prompting, have gained popularity for claiming to enhance …

ReAct Meets ActRe: Autonomous Annotations of Agent Trajectories for Contrastive Self-Training

Z Yang, P Li, M Yan, J Zhang, F Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Language agents have demonstrated autonomous decision-making abilities by reasoning
with foundation models. Recently, efforts have been made to train language agents for …