Learning robust policy against disturbance in transition dynamics via state-conservative...

R Zhou, T Liu, M Cheng, D Kalathil… - Advances in neural …, 2024 - proceedings.neurips.cc

We study robust reinforcement learning (RL) with the goal of determining a well-performing
policy that is robust against model mismatch between the training simulator and the testing …

Cited by 13 · Related articles · All 7 versions

Double pessimism is provably efficient for distributionally robust offline reinforcement learning: Generic algorithm and robust partial coverage

J Blanchet, M Lu, T Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc

We study distributionally robust offline reinforcement learning (RL), which seeks to find an
optimal robust policy purely from an offline dataset that can perform well in perturbed …

Cited by 22 · Related articles · All 7 versions

Learning cut selection for mixed-integer linear programming via hierarchical sequence model

Z Wang, X Li, J Wang, Y Kuang, M Yuan, J Zeng… - arXiv preprint arXiv …, 2023 - arxiv.org

Cutting planes (cuts) are important for solving mixed-integer linear programs (MILPs), which
formulate a wide range of important real-world applications. Cut selection--which aims to …

Cited by 35 · Related articles · All 3 versions

Adjustable robust reinforcement learning for online 3d bin packing

Y Pan, Y Chen, F Lin - Advances in Neural Information …, 2023 - proceedings.neurips.cc

Designing effective policies for the online 3D bin packing problem (3D-BPP) has been a
long-standing challenge, primarily due to the unpredictable nature of incoming box …

Cited by 6 · Related articles · All 5 versions

Learning to stop cut generation for efficient mixed-integer linear programming

H Ling, Z Wang, J Wang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Cutting planes (cuts) play an important role in solving mixed-integer linear programs
(MILPs), as they significantly tighten the dual bounds and improve the solving performance …

Cited by 3 · Related articles · All 3 versions

UAV air combat autonomous trajectory planning method based on robust adversarial reinforcement learning

L Wang, S Zheng, S Tai, H Liu, T Yue - Aerospace Science and Technology, 2024 - Elsevier

The poor robustness of the air combat autonomous trajectory planning strategy (ATP)
trained through vanilla reinforcement learning (RL) methods is attributed to its dependence …

Cited by 1 · Related articles · All 2 versions

Provable sim-to-real transfer in continuous domain with partial observations

J Hu, H Zhong, C Jin, L Wang - arXiv preprint arXiv:2210.15598, 2022 - arxiv.org

Sim-to-real transfer trains RL agents in the simulated environments and then deploys them
in the real world. Sim-to-real transfer has been widely used in practice because it is often …

Cited by 6 · Related articles · All 3 versions

Minimax optimal and computationally efficient algorithms for distributionally robust offline reinforcement learning

Z Liu, P Xu - arXiv preprint arXiv:2403.09621, 2024 - arxiv.org

Distributionally robust offline reinforcement learning (RL), which seeks robust policy training
against environment perturbation by modeling dynamics uncertainty, calls for function …

Cited by 3 · Related articles · All 2 versions

Optimal transport perturbations for safe reinforcement learning with robustness guarantees

J Queeney, EC Ozcan, IC Paschalidis… - arXiv preprint arXiv …, 2023 - arxiv.org

Robustness and safety are critical for the trustworthy deployment of deep reinforcement
learning in real-world decision making applications. In particular, we require algorithms that …

Cited by 3 · Related articles · All 7 versions

Tube-based robust reinforcement learning for autonomous maneuver decision for UCAVs

W Lixin, S ZHENG, P Haiyin, LU Changqian… - Chinese Journal of …, 2024 - Elsevier

Reinforcement Learning (RL) algorithms enhance intelligence of air combat Autonomous
Maneuver Decision (AMD) policy, but they may underperform in target combat environments …
