Action noise in off-policy deep reinforcement learning: Impact on exploration and performance

J Hollenstein, S Auddy, M Saveriano… - arXiv preprint arXiv …, 2022 - arxiv.org
Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of exploration
such as the additive action noise often used in continuous control domains. Typically, the …

Colored Noise in PPO: Improved Exploration and Performance Through Correlated Action Sampling

J Hollenstein, G Martius, J Piater - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Proximal Policy Optimization (PPO), a popular on-policy deep reinforcement learning
method, employs a stochastic policy for exploration. In this paper, we propose a colored …