Dual Alignment Maximin Optimization for Offline Model-based RL

C Zhou, W Luo, H Li, C Han, T Guo, Z Zhang - arXiv preprint arXiv …, 2025 - arxiv.org
Offline reinforcement learning agents face significant deployment challenges due to the
synthetic-to-real distribution mismatch. While most prior research has focused on improving …