Improved empirical methods in reinforcement-learning evaluation

N Jiang, L Li - International conference on machine learning, 2016 - proceedings.mlr.press

We study the problem of off-policy value evaluation in reinforcement learning (RL), where
one aims to estimate the value of a new policy based on data collected by a different policy …

Cited by 818 - Related articles - All 10 versions

[PDF] usenix.org

Curator: Self-Managing Storage for Enterprise Clusters

I Cano, S Aiyar, V Arora, M Bhattacharyya… - … USENIX Symposium on …, 2017 - usenix.org

Modern cluster storage systems perform a variety of background tasks to improve the
performance, availability, durability, and cost-efficiency of stored data. For example, cleaners …

Cited by 340 - Related articles - All 10 versions

[PDF] arxiv.org

Recurrent reinforcement learning: a hybrid approach

X Li, L Li, J Gao, X He, J Chen, L Deng, J He - arXiv preprint arXiv …, 2015 - arxiv.org

Successful applications of reinforcement learning in real-world problems often require
dealing with partially observable states. It is in general very challenging to construct and …

Cited by 86 - Related articles - All 3 versions

[PDF] arxiv.org

Iroko: A framework to prototype reinforcement learning for data center traffic control

F Ruffy, M Przystupa, I Beschastnikh - arXiv preprint arXiv:1812.09975, 2018 - arxiv.org

Recent networking research has identified that data-driven congestion control (CC) can be
more efficient than traditional CC in TCP. Deep reinforcement learning (RL), in particular …

Cited by 39 - Related articles - All 5 versions

[PDF] aaai.org

Arena: A general evaluation platform and building toolkit for multi-agent intelligence

Y Song, A Wojcicki, T Lukasiewicz, J Wang… - Proceedings of the AAAI …, 2020 - aaai.org

Learning agents that are not only capable of taking tests, but also innovating is becoming a
hot topic in AI. One of the most promising paths towards this vision is multi-agent learning …

Cited by 30 - Related articles - All 11 versions

[PDF] googleapis.com

Multiple-action computational model training and operation

J Chen, L Deng, J Gao, X He, L Li, J He… - US Patent …, 2021 - Google Patents

A processing unit can determine a first feature value corresponding to a
session by operating a first network computational model (NCM) based part on information …

Cited by 27 - Related articles - All 4 versions

[PDF] googleapis.com

Multi-model controller

J Gao, L Deng, X He, P Singh, L Li, J Chen… - US Patent …, 2021 - Google Patents

A processing unit can operate a first recurrent computational model (RCM) to provide first
state information and a predicted result value. The processing unit can operate a first …

Cited by 24 - Related articles - All 4 versions

Application of Deep Reinforcement Learning Methods in Debt Collection

G Kuzmin, AI Panov, I Razvorotnev… - Proceedings of the Fifth …, 2022 - Springer

In the last few years, there is a growing interest in offline reinforcement learning (offline RL)
and in reinforcement learning (RL) in general. In this paper, we presented an example of …

Cited by 1 - Related articles - All 4 versions

[PDF] ubc.ca

Data-driven data center traffic control

F Ruffy Varga - 2019 - open.library.ubc.ca

Recent networking research has identified that data-driven congestion control (CC) can be
more efficient than traditional CC in data centers (DCs). Deep reinforcement learning (RL) …

Cited by 1 - Related articles - All 4 versions
