Related articles - Academic Resource Search

Adversarial motion priors make good substitutes for complex reward functions

A Escontrela, XB Peng, W Yu, T Zhang… - 2022 IEEE/RSJ …, 2022 - ieeexplore.ieee.org

Training a high-dimensional simulated agent with an under-specified reward function often
leads the agent to learn physically infeasible strategies that are ineffective when deployed in …

Cited by 68 - Related articles - All 5 versions

[PDF] arxiv.org

Learning human behaviors from motion capture by adversarial imitation

J Merel, Y Tassa, D TB, S Srinivasan, J Lemmon… - arXiv preprint arXiv …, 2017 - arxiv.org

Rapid progress in deep reinforcement learning has made it increasingly feasible to train
controllers for high-dimensional humanoid bodies. However, methods that use pure …

Cited by 229 - Related articles - All 4 versions

[PDF] arxiv.org

Advanced skills through multiple adversarial motion priors in reinforcement learning

E Vollenweider, M Bjelonic, V Klemm… - … on Robotics and …, 2023 - ieeexplore.ieee.org

Reinforcement learning (RL) has emerged as a powerful approach for locomotion control of
highly articulated robotic systems. However, one major challenge is the tedious process of …

Cited by 58 - Related articles - All 5 versions

[PDF] arxiv.org

Generalizing skills with semi-supervised reinforcement learning

C Finn, T Yu, J Fu, P Abbeel, S Levine - arXiv preprint arXiv:1612.00429, 2016 - arxiv.org

Deep reinforcement learning (RL) can acquire complex behaviors from low-level inputs,
such as images. However, real-world applications of such methods require generalizing to …

Cited by 83 - Related articles - All 4 versions

[PDF] arxiv.org

Data-efficient reinforcement learning with self-predictive representations

M Schwarzer, A Anand, R Goel, RD Hjelm… - arXiv preprint arXiv …, 2020 - arxiv.org

While deep reinforcement learning excels at solving tasks where large amounts of data can
be collected through virtually unlimited interaction with the environment, learning from …

Cited by 284 - Related articles - All 8 versions

[PDF] arxiv.org

Unsupervised perceptual rewards for imitation learning

P Sermanet, K Xu, S Levine - arXiv preprint arXiv:1612.06699, 2016 - arxiv.org

Reward function design and exploration time are arguably the biggest obstacles to the
deployment of reinforcement learning (RL) agents in the real world. In many real-world …

Cited by 178 - Related articles - All 13 versions

[PDF] thecvf.com

RL-CycleGAN: Reinforcement learning aware simulation-to-real

K Rao, C Harris, A Irpan, S Levine… - Proceedings of the …, 2020 - openaccess.thecvf.com

Deep neural network based reinforcement learning (RL) can learn appropriate visual
representations for complex tasks like vision-based robotic grasping without the need for …

Cited by 177 - Related articles - All 7 versions

[PDF] arxiv.org

Inverse reinforcement learning for video games

A Tucker, A Gleave, S Russell - arXiv preprint arXiv:1810.10593, 2018 - arxiv.org

Deep reinforcement learning achieves superhuman performance in a range of video game
environments, but requires that a designer manually specify a reward function. It is often …

Cited by 55 - Related articles - All 7 versions

[PDF] neurips.cc

Robust imitation of diverse behaviors

Z Wang, JS Merel, SE Reed… - Advances in …, 2017 - proceedings.neurips.cc

Deep generative models have recently shown great promise in imitation learning for motor
control. Given enough data, even supervised approaches can do one-shot imitation …

Cited by 240 - Related articles - All 7 versions

[PDF] arxiv.org

Adversarial policies: Attacking deep reinforcement learning

A Gleave, M Dennis, C Wild, N Kant, S Levine… - arXiv preprint arXiv …, 2019 - arxiv.org

Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial
perturbations to their observations, similar to adversarial examples for classifiers. However …

Cited by 407 - Related articles - All 10 versions
