T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - arXiv preprint arXiv:2405.19909, 2024 - arxiv.org
In offline reinforcement learning, the out-of-distribution (OOD) problem is pronounced.
To address this, existing methods often constrain the learned policy through policy …