The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Benchmarks for automated commonsense reasoning: A survey

E Davis - ACM Computing Surveys, 2023 - dl.acm.org
More than one hundred benchmarks have been developed to test the commonsense
knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …

Grounding large language models in interactive environments with online reinforcement learning

T Carta, C Romac, T Wolf, S Lamprier… - International …, 2023 - proceedings.mlr.press
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture
abstract knowledge about world's physics to solve decision-making problems. Yet, the …

Cognitive architectures for language agents

TR Sumers, S Yao, K Narasimhan… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent efforts have incorporated large language models (LLMs) with external resources (e.g.,
the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or …

SwiftSage: A generative agent with fast and slow thinking for complex interactive tasks

BY Lin, Y Fu, K Yang, F Brahman… - Advances in …, 2024 - proceedings.neurips.cc
We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of
human cognition, designed to excel in action planning for complex interactive reasoning …

AgentTuning: Enabling generalized agent abilities for LLMs

A Zeng, M Liu, R Lu, B Wang, X Liu, Y Dong… - arXiv preprint arXiv …, 2023 - arxiv.org
Open large language models (LLMs) with great performance in various tasks have
significantly advanced the development of LLMs. However, they are far inferior to …

Put your money where your mouth is: Evaluating strategic planning and execution of LLM agents in an auction arena

J Chen, S Yuan, R Ye, BP Majumder… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in Large Language Models (LLMs) showcase advanced reasoning,
yet NLP evaluations often depend on static benchmarks. Evaluating this necessitates …

Understanding the planning of LLM agents: A survey

X Huang, W Liu, X Chen, X Wang, H Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) have shown significant intelligence, the progress to
leverage LLMs as planning modules of autonomous agents has attracted more attention …

Fusing pre-trained language models with multimodal prompts through reinforcement learning

Y Yu, J Chung, H Yun, J Hessel… - Proceedings of the …, 2023 - openaccess.thecvf.com
Language models are capable of commonsense reasoning: while domain-specific
models can learn from explicit knowledge (e.g., commonsense graphs [6], ethical norms [25]) …

T-Eval: Evaluating the tool utilization capability of large language models step by step

Z Chen, W Du, W Zhang, K Liu, J Liu… - Proceedings of the …, 2024 - aclanthology.org
Large language models (LLMs) have achieved remarkable performance on various NLP
tasks and are augmented by tools for broader applications. Yet, how to evaluate and …