Adaptive contention window design using deep Q-learning

A Kumar, G Verma, C Rao, A Swami… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
We study the problem of adaptive contention window (CW) design for random-access
wireless networks. More precisely, our goal is to design an intelligent node that can …

GrGym: When GNU Radio goes to (AI) gym

A Zubow, S Rösler, P Gawłowicz… - Proceedings of the 22nd …, 2021 - dl.acm.org
Trends like softwarization through the usage of flexible Software-defined Radio (SDR)
platforms together with the usage of Machine Learning (ML) techniques are key enablers for …

Learning self-imitating diverse policies

T Gangwani, Q Liu, J Peng - arXiv preprint arXiv:1805.10309, 2018 - arxiv.org
The success of popular algorithms for deep reinforcement learning, such as policy-gradients
and Q-learning, relies heavily on the availability of an informative reward signal at each …

Fast learning for dynamic resource allocation in AI-enabled radio networks

MA Qureshi, C Tekin - IEEE Transactions on Cognitive …, 2019 - ieeexplore.ieee.org
Artificial Intelligence (AI)-enabled radios are expected to enhance the spectral efficiency of
5th generation (5G) millimeter wave (mmWave) networks by learning to optimize network …

A new convergent variant of Q-learning with linear function approximation

D Carvalho, FS Melo, P Santos - Advances in Neural …, 2020 - proceedings.neurips.cc
In this work, we identify a novel set of conditions that ensure convergence with probability 1
of Q-learning with linear function approximation, by proposing a two time-scale variation …

Batch reinforcement learning in a complex domain

S Kalyanakrishnan, P Stone - Proceedings of the 6th international joint …, 2007 - dl.acm.org
Temporal difference reinforcement learning algorithms are perfectly suited to autonomous
agents because they learn directly from an agent's experience based on sequential actions …

Feed-forward network training using optimal input gains

SS Malalur, M Manry - … Networks, IEEE-INNS-ENNS International Joint …, 2009 - computer.org
We study the problem of joint congestion control and scheduling in wireless networks. We
model the wireless network as a directed graph G=(V, E), where V denotes the set of nodes …

DisCor: Corrective feedback in reinforcement learning via distribution correction

A Kumar, A Gupta, S Levine - Advances in Neural …, 2020 - proceedings.neurips.cc
Deep reinforcement learning can learn effective policies for a wide range of tasks, but is
notoriously difficult to use due to instability and sensitivity to hyperparameters. The reasons …

Deep reinforcement learning paradigm for dense wireless networks in smart cities

R Ali, YB Zikria, BS Kim, SW Kim - Smart cities performability, cognition, & …, 2020 - Springer
Wireless local area networks (WLANs) are widely deployed for Internet-centric data
applications. Due to their extensive norm in our day-to-day wireless-enabled life, WLANs are …

Online target Q-learning with reverse experience replay: Efficiently finding the optimal policy for linear MDPs

N Agarwal, S Chaudhuri, P Jain, D Nagaraj… - arXiv preprint arXiv …, 2021 - arxiv.org
Q-learning is a popular Reinforcement Learning (RL) algorithm which is widely used in
practice with function approximation (Mnih et al., 2015). In contrast, existing theoretical …