Related articles - Academic Resource Search

Adaptive contention window design using deep Q-learning

A Kumar, G Verma, C Rao, A Swami… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

We study the problem of adaptive contention window (CW) design for random-access
wireless networks. More precisely, our goal is to design an intelligent node that can …

Cited by 38 · Related articles · All 4 versions

[PDF] ccs-labs.org

GrGym: When GNU Radio goes to (AI) gym

A Zubow, S Rösler, P Gawłowicz… - Proceedings of the 22nd …, 2021 - dl.acm.org

Trends like softwarization through the usage of flexible Software-defined Radio (SDR)
platforms together with the usage of Machine Learning (ML) techniques are key enablers for …

Cited by 11 · Related articles · All 5 versions

[PDF] arxiv.org

Learning self-imitating diverse policies

T Gangwani, Q Liu, J Peng - arXiv preprint arXiv:1805.10309, 2018 - arxiv.org

The success of popular algorithms for deep reinforcement learning, such as policy-gradients
and Q-learning, relies heavily on the availability of an informative reward signal at each …

Cited by 63 · Related articles · All 8 versions

[PDF] academia.edu

Fast learning for dynamic resource allocation in AI-enabled radio networks

MA Qureshi, C Tekin - IEEE Transactions on Cognitive …, 2019 - ieeexplore.ieee.org

Artificial Intelligence (AI)-enabled radios are expected to enhance the spectral efficiency of
5th generation (5G) millimeter wave (mmWave) networks by learning to optimize network …

Cited by 33 · Related articles · All 7 versions

[PDF] neurips.cc

A new convergent variant of Q-learning with linear function approximation

D Carvalho, FS Melo, P Santos - Advances in Neural …, 2020 - proceedings.neurips.cc

In this work, we identify a novel set of conditions that ensure convergence with probability 1
of Q-learning with linear function approximation, by proposing a two time-scale variation …

Cited by 31 · Related articles · All 5 versions

[PDF] psu.edu

Batch reinforcement learning in a complex domain

S Kalyanakrishnan, P Stone - Proceedings of the 6th international joint …, 2007 - dl.acm.org

Temporal difference reinforcement learning algorithms are perfectly suited to autonomous
agents because they learn directly from an agent's experience based on sequential actions …

Cited by 114 · Related articles · All 10 versions

Feed-forward network training using optimal input gains

SS Malalur, M Manry - … Networks, IEEE-INNS-ENNS International Joint …, 2009 - computer.org

We study the problem of joint congestion control and scheduling in wireless networks. We
model the wireless network as a directed graph G=(V, E), where V denotes the set of nodes …

Cited by 16 · Related articles · All 4 versions

[PDF] neurips.cc

DisCor: Corrective feedback in reinforcement learning via distribution correction

A Kumar, A Gupta, S Levine - Advances in Neural …, 2020 - proceedings.neurips.cc

Deep reinforcement learning can learn effective policies for a wide range of tasks, but is
notoriously difficult to use due to instability and sensitivity to hyperparameters. The reasons …

Cited by 108 · Related articles · All 7 versions

[PDF] researchgate.net

Deep reinforcement learning paradigm for dense wireless networks in smart cities

R Ali, YB Zikria, BS Kim, SW Kim - Smart cities performability, cognition, & …, 2020 - Springer

Wireless local area networks (WLANs) are widely deployed for Internet-centric data
applications. Due to their extensive norm in our day-to-day wireless-enabled life, WLANs are …

Cited by 19 · Related articles · All 5 versions

[PDF] arxiv.org

Online target Q-learning with reverse experience replay: Efficiently finding the optimal policy for linear MDPs

N Agarwal, S Chaudhuri, P Jain, D Nagaraj… - arXiv preprint arXiv …, 2021 - arxiv.org

Q-learning is a popular Reinforcement Learning (RL) algorithm which is widely used in
practice with function approximation (Mnih et al., 2015). In contrast, existing theoretical …

Cited by 23 · Related articles · All 5 versions
