Exploration in deep reinforcement learning: From single-agent to multiagent domain

M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai… - Frontiers of Computer …, 2023 - Springer

Transformer architectures have facilitated the development of large-scale and general-
purpose sequence models for prediction tasks in natural language processing and computer …

Cited by 18 - Related articles - All 6 versions

The state of AI-empowered backscatter communications: A comprehensive survey

F Xu, T Hussain, M Ahmed, K Ali… - IEEE Internet of …, 2023 - ieeexplore.ieee.org

The Internet of Things (IoT) is undergoing significant advancements, driven by the
emergence of backscatter communication (BC) and artificial intelligence (AI). BC is an …

Cited by 18 - Related articles - All 3 versions

Theoretical approaches to AI in supply chain optimization: Pathways to efficiency and resilience

EA Abaku, TE Edunjobi… - International Journal of …, 2024 - pdfs.semanticscholar.org

Abstract The integration of Artificial Intelligence (AI) into supply chain management has
emerged as a pivotal avenue for enhancing efficiency and resilience in contemporary …

Cited by 63 - Related articles - All 2 versions

Energy management for demand response in networked greenhouses with multi-agent deep reinforcement learning

A Ajagekar, B Decardi-Nelson, F You - Applied Energy, 2024 - Elsevier

Greenhouses are key to ensuring food security and realizing a sustainable future for
agriculture. However, to ensure crop growth efficiency, greenhouses consume a significant …

Cited by 9 - Related articles - All 5 versions

Small batch deep reinforcement learning

J Obando Ceron, M Bellemare… - Advances in Neural …, 2024 - proceedings.neurips.cc

In value-based deep reinforcement learning with replay memories, the batch size parameter
specifies how many transitions to sample for each gradient update. Although critical to the …

Cited by 3 - Related articles - All 5 versions

Behavior contrastive learning for unsupervised skill discovery

R Yang, C Bai, H Guo, S Li, B Zhao… - International …, 2023 - proceedings.mlr.press

In reinforcement learning, unsupervised skill discovery aims to learn diverse skills without
extrinsic rewards. Previous methods discover skills by maximizing the mutual information …

Cited by 7 - Related articles - All 6 versions

Semantically aligned task decomposition in multi-agent reinforcement learning

W Li, D Qiao, B Wang, X Wang, B Jin, H Zha - arXiv preprint arXiv …, 2023 - arxiv.org

The difficulty of appropriately assigning credit is particularly heightened in cooperative
MARL with sparse reward, due to the concurrent time and structural scales involved …

Cited by 13 - Related articles - All 3 versions

Physics-informed deep reinforcement learning for enhancement on tunnel boring machine's advance speed and stability

P Lin, M Wu, Z Xiao, RLK Tiong, L Zhang - Automation in Construction, 2024 - Elsevier

The traditional mode of Tunnel Boring Machine (TBM) operation is limited in its
applicability and efficiency to meet the growing demand for underground spaces. Current …

Cited by 3 - Related articles

When to switch: planning and learning for partially observable multi-agent pathfinding

A Skrynnik, A Andreychuk, K Yakovlev… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Multi-agent pathfinding (MAPF) is a problem that involves finding a set of non-conflicting
paths for a set of agents confined to a graph. In this work, we study a MAPF setting, where …

Cited by 6 - Related articles - All 5 versions

Ovd-explorer: Optimism should not be the sole pursuit of exploration in noisy environments

J Liu, Z Wang, Y Zheng, J Hao, C Bai, J Ye… - Proceedings of the …, 2024 - ojs.aaai.org

In reinforcement learning, the optimism in the face of uncertainty (OFU) is a mainstream
principle for directing exploration towards less explored areas, characterized by higher …

Cited by 3 - Related articles - All 3 versions
