J Wu, H Ma, C Deng, M Long - Advances in Neural …, 2024 - proceedings.neurips.cc
Unsupervised pre-training methods utilizing large and diverse datasets have achieved tremendous success across a range of domains. Recent work has investigated such …
State-of-the-art reinforcement learning (RL) algorithms typically use random sampling (e.g., $\epsilon$-greedy) for exploration, but this method fails on hard-exploration tasks like …
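For context, a minimal sketch of the $\epsilon$-greedy rule the snippet names: with probability $\epsilon$ act uniformly at random, otherwise act greedily with respect to the current value estimates. The Q-values and the $\epsilon=0.1$ setting below are illustrative assumptions, not taken from the paper.

import numpy as np

def epsilon_greedy(q_values: np.ndarray, epsilon: float = 0.1,
                   rng: np.random.Generator | None = None) -> int:
    """With probability epsilon pick a uniformly random action;
    otherwise pick the greedy (highest-estimated-value) action."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage: select an action for a state with 4 available actions.
action = epsilon_greedy(np.array([0.2, 0.8, 0.1, 0.5]), epsilon=0.1)

Because the random branch ignores the value estimates entirely, this rule explores by undirected chance, which is the failure mode on hard-exploration tasks that the snippet alludes to.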
A promising technique for exploration is to maximize the entropy of the visited state distribution, i.e., the state entropy, by encouraging uniform coverage of the visited state space. While it has been …
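For reference, the objective this snippet describes is the Shannon entropy of the policy's state-visitation distribution; the notation $d^\pi$ below is a common convention, not notation taken from the paper:

% State entropy: Shannon entropy of the state-visitation distribution d^\pi.
% It is maximized when d^\pi is uniform over the state space S, which is why
% maximizing it encourages uniform coverage of visited states.
H(d^\pi) = -\sum_{s \in \mathcal{S}} d^\pi(s) \log d^\pi(s)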
M Dunion, T McInroe, KS Luck… - Advances in Neural …, 2024 - proceedings.neurips.cc
Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature …
Satisfying a variety of conflicting needs in a changing environment is a fundamental challenge for any adaptive agent. Here, we show that designing an agent in a modular …
Current reinforcement learning algorithms struggle in sparse and complex environments, most notably in long-horizon manipulation tasks entailing a plethora of different sequences …
The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning …
Cooperative multi-agent reinforcement learning (MARL) requires agents to explore in order to learn to cooperate. Existing value-based MARL algorithms commonly rely on random exploration …
Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process. In this scenario, we …
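A minimal sketch of the offline-to-online (OtO) loop the snippet describes: pretrain an agent on a fixed dataset, then fine-tune it with online environment interaction. The Agent stub, the transition format, and the batch size are illustrative assumptions, not the paper's method or API.

import random

class Agent:
    def update(self, batch):   # one training step on a batch (stub)
        pass
    def act(self, state):      # select an action (stub)
        return 0

def offline_to_online(agent, dataset, env, offline_steps, online_steps):
    # Phase 1 (offline): pretrain on the static dataset only.
    for _ in range(offline_steps):
        agent.update(random.sample(dataset, k=min(32, len(dataset))))
    # Phase 2 (online): fine-tune, appending fresh transitions to the buffer.
    state = env.reset()
    for _ in range(online_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        dataset.append((state, action, reward, next_state, done))
        agent.update(random.sample(dataset, k=min(32, len(dataset))))
        state = env.reset() if done else next_state

The two phases share one replay buffer here; how (and whether) to mix offline and online data during fine-tuning is exactly the kind of design question the OtO setting raises.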