T Liu, Y Li, Y Lan, H Gao, W Pan, X Xu - arXiv e-prints, 2024 - ui.adsabs.harvard.edu
In offline reinforcement learning, the out-of-distribution (OOD) problem is pronounced.
To address this, existing methods often constrain the learned policy through policy …