N Xiong, Y Du, L Huang - arXiv preprint arXiv:2302.06064, 2023 - arxiv.org
In this paper, we investigate a novel safe reinforcement learning problem with step-wise
violation constraints. Our problem differs from existing works in that we consider stricter step …