Bigger, better, faster: Human-level Atari with human-level efficiency

S Dohare, JF Hernandez-Garcia, Q Lan, P Rahman… - Nature, 2024 - nature.com

Artificial neural networks, deep-learning methods and the backpropagation algorithm form
the foundation of modern machine learning and artificial intelligence. These methods are …

Cited by 36 - Related articles - All 2 versions

[PDF] neurips.cc

STORM: Efficient stochastic transformer based world models for reinforcement learning

W Zhang, G Wang, J Sun, Y Yuan… - Advances in Neural …, 2024 - proceedings.neurips.cc

Recently, model-based reinforcement learning algorithms have demonstrated remarkable
efficacy in visual input environments. These approaches begin by constructing a …

Cited by 24 - Related articles - All 5 versions

[PDF] neurips.cc

Double Gumbel Q-learning

DYT Hui, AC Courville… - Advances in Neural …, 2023 - proceedings.neurips.cc

Abstract We show that Deep Neural Networks introduce two heteroscedastic Gumbel noise
sources into Q-Learning. To account for these noise sources, we propose Double Gumbel Q …

Cited by 8 - Related articles - All 3 versions

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org

The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

[PDF] arxiv.org

Maintaining plasticity in deep continual learning

S Dohare, JF Hernandez-Garcia, P Rahman… - arXiv preprint arXiv …, 2023 - arxiv.org

Modern deep-learning systems are specialized to problem settings in which training occurs
once and then never again, as opposed to continual-learning settings in which training …

Cited by 24 - Related articles - All 2 versions

[PDF] arxiv.org

DrM: Mastering visual reinforcement learning through dormant ratio minimization

G Xu, R Zheng, Y Liang, X Wang, Z Yuan, T Ji… - arXiv preprint arXiv …, 2023 - arxiv.org

Visual reinforcement learning (RL) has shown promise in continuous control tasks. Despite
its progress, current algorithms are still unsatisfactory in virtually every aspect of the …

Cited by 23 - Related articles - All 5 versions

[PDF] arxiv.org

Overestimation, overfitting, and plasticity in actor-critic: the bitter lesson of reinforcement learning

M Nauman, M Bortkiewicz, P Miłoś, T Trzciński… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in off-policy Reinforcement Learning (RL) have significantly improved
sample efficiency, primarily due to the incorporation of various forms of regularization that …

Cited by 10 - Related articles - All 3 versions

[PDF] neurips.cc

Small batch deep reinforcement learning

J Obando Ceron, M Bellemare… - Advances in Neural …, 2024 - proceedings.neurips.cc

In value-based deep reinforcement learning with replay memories, the batch size parameter
specifies how many transitions to sample for each gradient update. Although critical to the …

Cited by 6 - Related articles - All 5 versions

Adaptive collision avoidance decisions in autonomous ship encounter scenarios through rule-guided vision supervised learning

K Zheng, X Zhang, C Wang, Y Li, J Cui, L Jiang - Ocean Engineering, 2024 - Elsevier

Limitations are identified in the expressive capabilities of the deep feature extraction network
employed in deep reinforcement learning (DRL), particularly in complex scenarios …

Cited by 11 - Related articles - All 3 versions

[PDF] arxiv.org

AMAGO: Scalable in-context reinforcement learning for adaptive agents

J Grigsby, L Fan, Y Zhu - arXiv preprint arXiv:2310.09971, 2023 - arxiv.org

We introduce AMAGO, an in-context Reinforcement Learning (RL) agent that uses sequence
models to tackle the challenges of generalization, long-term memory, and meta-learning …

Cited by 20 - Related articles - All 3 versions
