WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning Q Yang, TD Simão, SH Tindemans, MTJ Spaan AAAI 2021, 2021 | 120 | 2021 |
Safety-constrained reinforcement learning with a distributional safety critic Q Yang, TD Simão, SH Tindemans, MTJ Spaan Machine Learning 112 (3), 859–887, 2023 | 40 | 2023 |
CEM: Constrained entropy maximization for task-agnostic safe exploration Q Yang, MTJ Spaan AAAI 2023, 2023 | 12 | 2023 |
Reinforcement Learning by Guided Safe Exploration Q Yang, TD Simão, N Jansen, SH Tindemans, MTJ Spaan ECAI 2023, 2023 | 10* | 2023 |
A Modern Perspective on Safe Automated Driving for Different Traffic Dynamics using Constrained Reinforcement Learning D Kamran, TD Simão, Q Yang, CT Ponnambalam, J Fischer, MTJ Spaan, ... ITSC 2022, 2022 | 9 | 2022 |
Exploring the use of invalid action masking in reinforcement learning: A comparative study of on-policy and off-policy algorithms in real-time strategy games Y Hou, X Liang, J Zhang, Q Yang, A Yang, N Wang Applied Sciences 13 (14), 8283, 2023 | 4 | 2023 |
Novel second-order sliding mode control based 3D guidance law with impact angle constraints S Shi, J Zhao, Y Chong, Q Yang, H You Journal of Beijing University of Aeronautics and Astronautics (北京航空航天大学学报) 45 (3), 614-623, 2019 | 4* | 2019 |
General Optimal Trajectory Planning: Enabling Autonomous Vehicles with the Principle of Least Action H Huang, Y Liu, J Liu, Q Yang, J Wang, D Abbink, A Zgonnikov Engineering 33, 63-76, 2024 | 3 | 2024 |
Safe adaptive policy transfer reinforcement learning for distributed multiagent control B Du, W Xie, Y Li, Q Yang, W Zhang, RR Negenborn, Y Pang, H Chen IEEE Transactions on Neural Networks and Learning Systems, 2023 | 3 | 2023 |
Subtask-masked curriculum learning for reinforcement learning with application to UAV maneuver decision-making Y Hou, X Liang, M Lv, Q Yang, Y Li Engineering Applications of Artificial Intelligence 125, 106703, 2023 | 1 | 2023 |
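Design of a cooperative attack guidance law for missiles under field-of-view angle constraints. 赵久奋, 史绍琨, 尤浩, 杨奇松 Journal of National University of Defense Technology/Guofang Keji Daxue …, 2019 | 1 | 2019 |
Modeling and decision-making for staged ground-strike weapon-target assignment 杨奇松, 王顺宏, 王然辉, 牛晓洁 Journal of Ballistics (弹道学报) 29 (2), 90-96, 2017 | 1 | 2017 |
Glide missile trajectory planning using a waypoint-based improved Gauss pseudospectral method 牛晓洁, 李邦杰, 舒健生, 潘乐飞, 杨奇松 Journal of Ballistics (弹道学报) 28 (4), 36-41, 2016 | 1 | 2016 |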
Analyzing Generalization in Policy Networks: A Case Study with the Double-Integrator System R Zhang, H Han, M Lv, Q Yang, J Cheng AAAI 2024, 2024 | | 2024 |
Risk Aversion and Guided Exploration in Safety-Constrained Reinforcement Learning Q Yang Delft University of Technology, 2023 | | 2023 |
Refined Risk Management in Safe Reinforcement Learning with a Distributional Safety Critic Q Yang, TD Simão, SH Tindemans, MTJ Spaan International Workshop on Safe Reinforcement Learning, 2022 | | 2022 |
Adaptive-grid fast assignment method for disturbing gravity along skip-glide trajectories. 王顺宏, 戴陈超, 李剑, 杨奇松 Journal of National University of Defense Technology/Guofang Keji Daxue …, 2019 | | 2019 |