Large sequence models for sequential decision-making: a survey

M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai… - Frontiers of Computer …, 2023 - Springer
Transformer architectures have facilitated the development of large-scale and general-
purpose sequence models for prediction tasks in natural language processing and computer …

The state of AI-empowered backscatter communications: A comprehensive survey

F Xu, T Hussain, M Ahmed, K Ali… - IEEE Internet of …, 2023 - ieeexplore.ieee.org
The Internet of Things (IoT) is undergoing significant advancements, driven by the
emergence of backscatter communication (BC) and artificial intelligence (AI). BC is an …

Theoretical approaches to AI in supply chain optimization: Pathways to efficiency and resilience

EA Abaku, TE Edunjobi… - International Journal of …, 2024 - pdfs.semanticscholar.org
The integration of Artificial Intelligence (AI) into supply chain management has
emerged as a pivotal avenue for enhancing efficiency and resilience in contemporary …

Energy management for demand response in networked greenhouses with multi-agent deep reinforcement learning

A Ajagekar, B Decardi-Nelson, F You - Applied Energy, 2024 - Elsevier
Greenhouses are key to ensuring food security and realizing a sustainable future for
agriculture. However, to ensure crop growth efficiency, greenhouses consume a significant …

Small batch deep reinforcement learning

J Obando Ceron, M Bellemare… - Advances in Neural …, 2024 - proceedings.neurips.cc
In value-based deep reinforcement learning with replay memories, the batch size parameter
specifies how many transitions to sample for each gradient update. Although critical to the …

Behavior contrastive learning for unsupervised skill discovery

R Yang, C Bai, H Guo, S Li, B Zhao… - International …, 2023 - proceedings.mlr.press
In reinforcement learning, unsupervised skill discovery aims to learn diverse skills without
extrinsic rewards. Previous methods discover skills by maximizing the mutual information …

Semantically aligned task decomposition in multi-agent reinforcement learning

W Li, D Qiao, B Wang, X Wang, B Jin, H Zha - arXiv preprint arXiv …, 2023 - arxiv.org
The difficulty of appropriately assigning credit is particularly heightened in cooperative
MARL with sparse reward, due to the concurrent time and structural scales involved …

Physics-informed deep reinforcement learning for enhancement on tunnel boring machine's advance speed and stability

P Lin, M Wu, Z Xiao, RLK Tiong, L Zhang - Automation in Construction, 2024 - Elsevier
The traditional mode of Tunnel Boring Machine (TBM) operation is limited in its
applicability and efficiency to meet the growing demand for underground spaces. Current …

When to switch: planning and learning for partially observable multi-agent pathfinding

A Skrynnik, A Andreychuk, K Yakovlev… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Multi-agent pathfinding (MAPF) is a problem that involves finding a set of non-conflicting
paths for a set of agents confined to a graph. In this work, we study a MAPF setting, where …

OVD-Explorer: Optimism should not be the sole pursuit of exploration in noisy environments

J Liu, Z Wang, Y Zheng, J Hao, C Bai, J Ye… - Proceedings of the …, 2024 - ojs.aaai.org
In reinforcement learning, the optimism in the face of uncertainty (OFU) is a mainstream
principle for directing exploration towards less explored areas, characterized by higher …