We propose an intuitive LLM prompting framework (AgentKit) for multifunctional agents. AgentKit offers a unified framework for explicitly constructing a complex "thought process" …
Existing grasp prediction approaches are mostly based on offline learning and ignore exploratory grasp learning during online adaptation to new picking scenarios, i.e., unseen …
This paper introduces an effective and practical step toward approximate Bayesian inference in on-policy actor-critic deep reinforcement learning. This step manifests as three …
Despite recent progress in offline learning, these methods are still trained and tested on the same environment. In this paper, we compare the generalization abilities of widely used …
Reinforcement learning agents may sometimes develop habits that are effective only when specific policies are followed. After an initial exploration phase in which agents try out …
Z Sun, H Shi, MA Côté, G Berseth, X Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org
While large language models (LLMs) have been increasingly deployed across tasks in language understanding and interactive decision-making, their impressive performance is …
Effective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a …
Despite the recent progress in offline reinforcement learning (RL) algorithms, agents are usually trained and tested on the same environment. In this paper, we perform an in-depth …
We present $\varepsilon$-retrain, an exploration strategy designed to encourage a behavioral preference while optimizing policies with monotonic improvement guarantees. To …