A brief review of portfolio optimization techniques

A Gunjan, S Bhattacharyya - Artificial Intelligence Review, 2023 - Springer
Portfolio optimization has always been a challenging proposition in finance and
management. Portfolio optimization facilitates the selection of portfolios in a volatile market …

Conservative Q-learning for offline reinforcement learning

A Kumar, A Zhou, G Tucker… - Advances in Neural …, 2020 - proceedings.neurips.cc
Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …

Contrastive learning as goal-conditioned reinforcement learning

B Eysenbach, T Zhang, S Levine… - Advances in Neural …, 2022 - proceedings.neurips.cc
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often …

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Tensor programs II: Neural tangent kernel for any architecture

G Yang - arXiv preprint arXiv:2006.14548, 2020 - arxiv.org
We prove that a randomly initialized neural network of any architecture has its Tangent
Kernel (NTK) converge to a deterministic limit, as the network widths tend to infinity. We …

Softmax deep double deterministic policy gradients

L Pan, Q Cai, L Huang - Advances in neural information …, 2020 - proceedings.neurips.cc
A widely-used actor-critic reinforcement learning algorithm for continuous control, Deep
Deterministic Policy Gradients (DDPG), suffers from the overestimation problem, which can …

IRIS: Implicit reinforcement without interaction at scale for learning control from offline robot manipulation data

A Mandlekar, F Ramos, B Boots… - … on Robotics and …, 2020 - ieeexplore.ieee.org
Learning from offline task demonstrations is a problem of great interest in robotics. For
simple short-horizon manipulation tasks with modest variation in task instances, offline …

DisCor: Corrective feedback in reinforcement learning via distribution correction

A Kumar, A Gupta, S Levine - Advances in Neural …, 2020 - proceedings.neurips.cc
Deep reinforcement learning can learn effective policies for a wide range of tasks, but is
notoriously difficult to use due to instability and sensitivity to hyperparameters. The reasons …

Introduction to reinforcement learning

Z Ding, Y Huang, H Yuan, H Dong - Deep reinforcement learning …, 2020 - Springer
In this chapter, we introduce the fundamentals of classical reinforcement learning and
provide a general overview of deep reinforcement learning. We first start with the basic …

Finite-time analysis of Whittle index based Q-learning for restless multi-armed bandits with neural network function approximation

G Xiong, J Li - Advances in Neural Information Processing …, 2023 - proceedings.neurips.cc
Whittle index policy is a heuristic to the intractable restless multi-armed bandits (RMAB)
problem. Although it is provably asymptotically optimal, finding Whittle indices remains …