Memory-augmented theory of mind network

D Nguyen, P Nguyen, H Le, K Do… - Proceedings of the …, 2023 - ojs.aaai.org
Social reasoning necessitates the capacity of theory of mind (ToM), the ability to
contextualise and attribute mental states to others without having access to their internal …

Neural episodic control with state abstraction

Z Li, D Zhu, Y Hu, X Xie, L Ma, Y Zheng, Y Song… - arXiv preprint arXiv …, 2023 - arxiv.org
Existing Deep Reinforcement Learning (DRL) algorithms suffer from sample inefficiency.
Generally, episodic control-based approaches are solutions that leverage highly-rewarded …

Policy Optimization with Smooth Guidance Rewards Learned from Sparse-Reward Demonstrations

G Wang, F Wu, X Zhang, T Chen - arXiv preprint arXiv:2401.00162, 2023 - arxiv.org
The sparsity of reward feedback remains a challenging problem in online deep
reinforcement learning (DRL). Previous approaches have utilized temporal credit …

Temporally extended successor feature neural episodic control

X Zhu - Scientific Reports, 2024 - nature.com
One of the long-term goals of reinforcement learning is to build intelligent agents capable of
rapidly learning and flexibly transferring skills, similar to humans and animals. In this paper …

Continuous episodic control

Z Yang, TM Moerland, M Preuss… - 2023 IEEE Conference …, 2023 - ieeexplore.ieee.org
Non-parametric episodic memory can be used to quickly latch onto high-rewarded
experience in reinforcement learning tasks. In contrast to parametric deep reinforcement …

Episodic policy gradient training

H Le, M Abdolshah, TK George, K Do… - Proceedings of the …, 2022 - ojs.aaai.org
We introduce a novel training procedure for policy gradient methods wherein episodic
memory is used to optimize the hyperparameters of reinforcement learning algorithms on …

Learning to constrain policy optimization with virtual trust region

TH Le, T Karimpanal George… - Advances in …, 2022 - proceedings.neurips.cc
We introduce a constrained optimization method for policy gradient reinforcement learning,
which uses two trust regions to regulate each policy update. In addition to using the …

Beyond Surprise: Improving Exploration Through Surprise Novelty

H Le, K Do, D Nguyen, S Venkatesh - AAMAS, 2024 - thaihungle.github.io
What motivates agents to explore? Successfully answering this question would enable
agents to learn efficiently in formidable tasks. Random explorations such as 𝜖-greedy are …

Large Language Models Prompting With Episodic Memory

D Do, Q Tran, S Venkatesh, H Le - arXiv preprint arXiv:2408.07465, 2024 - arxiv.org
Prompt optimization is essential for enhancing the performance of Large Language Models
(LLMs) in a range of Natural Language Processing (NLP) tasks, particularly in scenarios of …

Computational modeling of the interactions between episodic memory and cognitive control

H Chateau-Laurent - 2024 - theses.hal.science
Episodic memory is often illustrated with the madeleine de Proust excerpt as the ability to
re-experience a situation from the past following the perception of a stimulus. This simplistic …