Doubly robust off-policy value evaluation for reinforcement learning

N Jiang, L Li - International conference on machine learning, 2016 - proceedings.mlr.press
We study the problem of off-policy value evaluation in reinforcement learning (RL), where
one aims to estimate the value of a new policy based on data collected by a different policy …

Curator: Self-Managing Storage for Enterprise Clusters

I Cano, S Aiyar, V Arora, M Bhattacharyya… - … USENIX Symposium on …, 2017 - usenix.org
Modern cluster storage systems perform a variety of background tasks to improve the
performance, availability, durability, and cost-efficiency of stored data. For example, cleaners …

Recurrent reinforcement learning: a hybrid approach

X Li, L Li, J Gao, X He, J Chen, L Deng, J He - arXiv preprint arXiv …, 2015 - arxiv.org
Successful applications of reinforcement learning in real-world problems often require
dealing with partially observable states. It is in general very challenging to construct and …

Iroko: A framework to prototype reinforcement learning for data center traffic control

F Ruffy, M Przystupa, I Beschastnikh - arXiv preprint arXiv:1812.09975, 2018 - arxiv.org
Recent networking research has identified that data-driven congestion control (CC) can be
more efficient than traditional CC in TCP. Deep reinforcement learning (RL), in particular …

Arena: A general evaluation platform and building toolkit for multi-agent intelligence

Y Song, A Wojcicki, T Lukasiewicz, J Wang… - Proceedings of the AAAI …, 2020 - aaai.org
Learning agents that are not only capable of taking tests, but also innovating is becoming a
hot topic in AI. One of the most promising paths towards this vision is multi-agent learning …

Multiple-action computational model training and operation

J Chen, L Deng, J Gao, X He, L Li, J He… - US Patent …, 2021 - Google Patents
A processing unit can determine a first feature value corresponding to a
session by operating a first network computational model (NCM) based part on information …

Multi-model controller

J Gao, L Deng, X He, P Singh, L Li, J Chen… - US Patent …, 2021 - Google Patents
A processing unit can operate a first recurrent computational model (RCM) to provide first
state information and a predicted result value. The processing unit can operate a first …

Application of Deep Reinforcement Learning Methods in Debt Collection

G Kuzmin, AI Panov, I Razvorotnev… - Proceedings of the Fifth …, 2022 - Springer
In the last few years, there has been growing interest in offline reinforcement learning (offline RL)
and in reinforcement learning (RL) in general. In this paper, we presented an example of …

Data-driven data center traffic control

F Ruffy Varga - 2019 - open.library.ubc.ca
Recent networking research has identified that data-driven congestion control (CC) can be
more efficient than traditional CC in data centers (DCs). Deep reinforcement learning (RL) …