Tuning the discount factor in order to reach average optimality on deterministic MDPs

J Grand-Clément, M Petrik - Advances in Neural …, 2024 - proceedings.neurips.cc

We introduce the Blackwell discount factor for Markov Decision Processes (MDPs). Classical
objectives for MDPs include discounted, average, and Blackwell optimality. Many existing …

Cited by 14 Related articles All 6 versions

[PDF] janvanhaaren.be

[PDF] Beyond action valuation: A deep reinforcement learning framework for optimizing player decisions in soccer

P Rahimian, J Van Haaren… - 16th MIT Sloan Sports …, 2022 - janvanhaaren.be

Soccer players need to make many decisions throughout a match in order to maximize their
team's chances of winning. Unfortunately, these decisions are challenging to measure and …

Cited by 16 Related articles All 3 versions

[PDF] ieee.org

A Deep Reinforcement Learning Approach for Competitive Task Assignment in Enterprise Blockchain

G Volpe, AM Mangini, MP Fanti - IEEE Access, 2023 - ieeexplore.ieee.org

With the advent of Industry 4.0, the demand of high computing power for tasks such as data
mining, 3D rendering, file conversion and cryptography is continuously growing. To this …

Cited by 4 Related articles All 2 versions

[PDF] sagepub.com

Towards maximizing expected possession outcome in soccer

P Rahimian, J Van Haaren… - International Journal of …, 2024 - journals.sagepub.com

Soccer players need to make many decisions throughout a match in order to maximize their
team's chances of winning. Unfortunately, these decisions are challenging to measure and …

Cited by 6 Related articles All 7 versions

[PDF] wiley.com

A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q‐Learning

H Kayakoku, MS Guzel, E Bostanci, IT Medeni… - …, 2021 - Wiley Online Library

This paper addresses a new machine learning‐based behavioral strategy using the deep
Q‐learning algorithm for the RoboCode simulation platform. According to this strategy, a new …

Cited by 6 Related articles All 12 versions

[HTML] sciencedirect.com

[HTML] Optimal investment strategy on data analytics capabilities of startups via Markov decision analysis

M Voorneveld, M de Groot - Decision Analytics Journal, 2024 - Elsevier

Startup companies operate in an unpredictable and unstable business environment. They
have the potential to grow through optimal decision-making … Or a suboptimal decision …

Cited by 5 Related articles

[PDF] carleton.ca

Multi-Agent Deep Reinforcement Learning Assisted Pre-connect Handover Management

Y Wei - 2022 - repository.library.carleton.ca

This thesis proposes a MBB adopted handover mechanism, namely, pre-connect handover
(PHO). PHO aims to provide a seamless and reliable handover for 5G networks. PHO …

[PDF] Discounting in Markov Chain Estimation

E Che, J Dong - ethche.github.io

Discounting can be viewed as a perturbation to improve the ergodicity of the Markov chain
by imposing more regular regenerations. It can improve the estimation efficiency in Markov …

[PDF] academia.edu

[PDF] Research Article A Novel Behavioral Strategy for RoboCode Platform Based on Deep Q-Learning

H Kayakoku, MS Guzel, E Bostanci, IT Medeni… - 2021 - academia.edu

[PDF] Bachelor's Thesis Assignment

X Li, Y Li, Y Zhan, XY Liu - 2020 - theses.cz

Since China's entry into the World Trade Organization at the end of 2001, China's luxury
consumer market has developed rapidly, and international luxury brands have entered the …
