Reducing Blackwell and average optimality to discounted MDPs via the Blackwell discount factor

J Grand-Clément, M Petrik - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce the Blackwell discount factor for Markov Decision Processes (MDPs). Classical
objectives for MDPs include discounted, average, and Blackwell optimality. Many existing …

[PDF] Beyond action valuation: A deep reinforcement learning framework for optimizing player decisions in soccer

P Rahimian, J Van Haaren… - 16th MIT Sloan Sports …, 2022 - janvanhaaren.be
Soccer players need to make many decisions throughout a match in order to maximize their
team's chances of winning. Unfortunately, these decisions are challenging to measure and …

A Deep Reinforcement Learning Approach for Competitive Task Assignment in Enterprise Blockchain

G Volpe, AM Mangini, MP Fanti - IEEE Access, 2023 - ieeexplore.ieee.org
With the advent of Industry 4.0, the demand of high computing power for tasks such as data
mining, 3D rendering, file conversion and cryptography is continuously growing. To this …

Towards maximizing expected possession outcome in soccer

P Rahimian, J Van Haaren… - International Journal of …, 2024 - journals.sagepub.com
Soccer players need to make many decisions throughout a match in order to maximize their
team's chances of winning. Unfortunately, these decisions are challenging to measure and …

A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q‐Learning

H Kayakoku, MS Guzel, E Bostanci, IT Medeni… - …, 2021 - Wiley Online Library
This paper addresses a new machine learning‐based behavioral strategy using the deep
Q‐learning algorithm for the RoboCode simulation platform. According to this strategy, a new …

[HTML] Optimal investment strategy on data analytics capabilities of startups via Markov decision analysis

M Voorneveld, M de Groot - Decision Analytics Journal, 2024 - Elsevier
Startup companies operate in an unpredictable and unstable business environment. They
have the potential to grow through optimal decision-making. Or a suboptimal decision …

Multi-Agent Deep Reinforcement Learning Assisted Pre-connect Handover Management

Y Wei - 2022 - repository.library.carleton.ca
This thesis proposes a MBB adopted handover mechanism, namely, pre-connect handover
(PHO). PHO aims to provide a seamless and reliable handover for 5G networks. PHO …

[PDF] Discounting in Markov Chain Estimation

E Che, J Dong - ethche.github.io
Discounting can be viewed as a perturbation to improve the ergodicity of the Markov chain
by imposing more regular regenerations. It can improve the estimation efficiency in Markov …

[PDF] A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q-Learning

H Kayakoku, MS Guzel, E Bostanci, IT Medeni… - 2021 - academia.edu

[PDF] Bachelor's Thesis Assignment

X Li, Y Li, Y Zhan, XY Liu - 2020 - theses.cz
Since China's entry into the World Trade Organization at the end of 2001, China's luxury
consumer market has developed rapidly, and international luxury brands have entered the …