Deep reinforcement learning for autonomous internet of things: Model, applications and challenges

L Lei, Y Tan, K Zheng, S Liu, K Zhang… - … Surveys & Tutorials, 2020 - ieeexplore.ieee.org
The Internet of Things (IoT) extends the Internet connectivity into billions of IoT devices
around the world, where the IoT devices collect and share information to reflect status of the …

Reinforcement learning for wind-farm flow control: Current state and future actions

M Abkar, N Zehtabiyan-Rezaie, A Iosifidis - Theoretical and Applied …, 2023 - Elsevier
Wind-farm flow control stands at the forefront of grand challenges in wind-energy science.
The central issue is that current algorithms are based on simplified models and, thus, fall …

Robotic assembly of timber joints using reinforcement learning

AA Apolinarska, M Pacher, H Li, N Cote… - Automation in …, 2021 - Elsevier
In architectural construction, automated robotic assembly is challenging due to occurring
tolerances, small series production and complex contact situations, especially in assembly …

TEXPLORE: real-time sample-efficient reinforcement learning for robots

T Hester, P Stone - Machine learning, 2013 - Springer
The use of robots in society could be expanded by using reinforcement learning (RL) to
allow robots to learn and adapt to new situations online. RL is a paradigm for learning …

Delay-aware model-based reinforcement learning for continuous control

B Chen, M Xu, L Li, D Zhao - Neurocomputing, 2021 - Elsevier
Action delays degrade the performance of reinforcement learning in many real-world
systems. This paper proposes a formal definition of delay-aware Markov Decision Process …

Reinforcement learning with random delays

Y Bouteiller, S Ramstedt, G Beltrame… - International …, 2020 - openreview.net
Action and observation delays commonly occur in many Reinforcement Learning
applications, such as remote control scenarios. We study the anatomy of randomly delayed …

Zero time waste: Recycling predictions in early exit neural networks

M Wołczyk, B Wójcik, K Bałazy… - Advances in …, 2021 - proceedings.neurips.cc
The problem of reducing processing time of large deep learning models is a fundamental
challenge in many real-world applications. Early exit methods strive towards this goal by …

Near-optimal regret for adversarial MDP with delayed bandit feedback

T Jin, T Lancewicki, H Luo… - Advances in Neural …, 2022 - proceedings.neurips.cc
The standard assumption in reinforcement learning (RL) is that agents observe feedback for
their actions immediately. However, in practice feedback is often observed in delay. This …

Learning long-term reward redistribution via randomized return decomposition

Z Ren, R Guo, Y Zhou, J Peng - arXiv preprint arXiv:2111.13485, 2021 - arxiv.org
Many practical applications of reinforcement learning require agents to learn from sparse
and delayed rewards. It challenges the ability of agents to attribute their actions to future …

Revisiting state augmentation methods for reinforcement learning with stochastic delays

S Nath, M Baranwal, H Khadilkar - Proceedings of the 30th ACM …, 2021 - dl.acm.org
Several real-world scenarios, such as remote control and sensing, are comprised of action
and observation delays. The presence of delays degrades the performance of reinforcement …