Critic regularized regression - Academic Resource Search

Critic regularized regression

Z Wang, A Novikov, K Zolna, JS Merel… - Advances in …, 2020 - proceedings.neurips.cc

… call Critic Weighted Policy (CWP) that uses the learned critic to improve results at test time. …
Our implementation of ABM is thus equipped with distributional critics and can use recurrent …

Cited by 347 · Related articles · All 9 versions

[PDF] mlr.press

A connection between one-step RL and critic regularization in reinforcement learning

B Eysenbach, M Geist, S Levine… - International …, 2023 - proceedings.mlr.press

… critic regularization, under some assumptions (see Fig. 1). The key idea is that, when using a
certain TD loss, the regularized critic … Q-values can be learned via regression, we will use a …

Cited by 5 · Related articles · All 5 versions

[PDF] arxiv.org

A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning

B Eysenbach, M Geist, S Levine… - arXiv preprint arXiv …, 2023 - arxiv.org

Cited by 2 · Related articles · All 4 versions

Actor–critic learning control with regularization and feature selection in policy gradient estimation

L Li, D Li, T Song, X Xu - IEEE Transactions on Neural …, 2020 - ieeexplore.ieee.org

… angle regression to solve the constructed ℓ1-regularization problem. In [… regularized
actor and critic also have the same NNZs in actor parameters because we set both of ℓ1-regularization …

Cited by 20 · Related articles · All 3 versions

[PDF] arxiv.org

Neural Stein critics with staged L²-regularization

M Repasky, X Cheng, Y Xie - arXiv preprint arXiv:2207.03406, 2022 - arxiv.org

… can be approximately understood from the perspective of kernel regression optimization,
rapidly reaching its best approximation in ∼ 1/λ time (up to a log factor), see Theorem 4.6 (and …

Cited by 3 · Related articles · All 2 versions

[PDF] arxiv.org

Idql: Implicit q-learning as an actor-critic method with diffusion policies

P Hansen-Estruch, I Kostrikov, M Janner… - arXiv preprint arXiv …, 2023 - arxiv.org

… -critic method by generalizing the critic objective and connecting it to a behavior-regularized …
EMaQ [15] defines a policy using an autoregressive behavioral cloning model, using the …

Cited by 108 · Related articles · All 4 versions

Actor-Critic Learning Control Based on ℓ2-Regularized Temporal-Difference Prediction With Gradient Correction

L Li, D Li, T Song, X Xu - IEEE Transactions on Neural …, 2018 - ieeexplore.ieee.org

… In the perspective of regularization, this initial coefficient can realize ℓ2-regularization (also
mentioned as ridge regression). Thus, the objective function is combined by a mean-square-…

Cited by 16 · Related articles · All 3 versions

[PDF] springer.com

TD-regularized actor-critic methods

S Parisi, V Tangkaratt, J Peters, ME Khan - Machine Learning, 2019 - Springer

… To avoid such issues, we propose to regularize the learning objective of the actor by … of the
critic. This improves stability by avoiding large steps in the actor update whenever the critic is …

Cited by 49 · Related articles · All 15 versions

Neural Stein Critics With Staged L²-Regularization

M Repasky, X Cheng, Y Xie - IEEE Transactions on Information …, 2023 - ieeexplore.ieee.org

… $L^{2}$ regularization in training a neural network Stein critic so as to distinguish
between … of regularization over training time, which leverages the advantages of highly-regularized …

lassopack: Model selection and prediction with regularized regression in Stata

A Ahrens, CB Hansen, ME Schaffer - The Stata Journal, 2020 - journals.sagepub.com

… regularized regression, as with supervised machine-learning methods more generally, is
prediction. Regularized regression … While regularized regression may select the true model as …

Cited by 257 · Related articles · All 30 versions
