Eliminating Primacy Bias in Online Reinforcement Learning by Self-Distillation

J Li, H Shi, H Wu, C Zhao, KS Hwang - IEEE Transactions on Neural …, 2024 - ieeexplore.ieee.org, europepmc.org, pubmed.ncbi.nlm.nih.gov
Excessive invalid explorations at the beginning of training lead deep reinforcement learning
process to fall into the risk of overfitting, further resulting in spurious decisions, which …