AG Prawiyogi, S Purnama… - … Transactions on Artificial …, 2022 - journal.pandawan.id
The goal of smart cities is to properly manage expanding urbanization, reduce energy usage, and enhance the economy and quality of life of the locals while also preserving the …
B Hu, U Syed - Advances in neural information processing …, 2019 - proceedings.neurips.cc
In this paper, we provide a unified analysis of temporal difference learning algorithms with linear function approximators by exploiting their connections to Markov jump linear systems …
Motivated by the emerging use of multi-agent reinforcement learning (MARL) in engineering applications such as networked robotics, swarming drones, and sensor networks, we …
S Qiu, Z Yang, X Wei, J Ye, Z Wang - arXiv preprint arXiv:2008.10103, 2020 - arxiv.org
Temporal-Difference (TD) learning with nonlinear smooth function approximation for policy evaluation has achieved great success in modern reinforcement learning. It is shown that …
Temporal difference (TD) learning is a popular algorithm for policy evaluation in reinforcement learning, but the vanilla TD can substantially suffer from the inherent …
The present contribution deals with decentralized policy evaluation in multi-agent Markov decision processes using temporal-difference (TD) methods with linear function …
H Shen, T Chen - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Stochastic approximation (SA) with multiple coupled sequences has found broad applications in machine learning such as bilevel learning and reinforcement learning (RL) …
Using a martingale concentration inequality, concentration bounds “from time $n_0$ on” are derived for stochastic approximation algorithms with contractive maps and both martingale …
In this paper, we develop some novel bounds for the Rademacher complexity and the generalization error in deep learning with iid and Markov datasets. The new Rademacher …