Optimal management policies for water reservoir operation are generally designed via stochastic dynamic programming (SDP). Yet, the adoption of SDP in complex real-world …
Reinforcement learning (RL) allows agents to learn how to optimally interact with complex environments. Fueled by recent advances in approximation-based algorithms, RL has …
DP Bertsekas - Journal of Control Theory and Applications, 2011 - Springer
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey …
When people make a decision, they are usually hesitant and irresolute for one thing or another, which makes it difficult to reach a final agreement. For example, two decision …
F Stulp, O Sigaud - arXiv preprint arXiv:1206.4621, 2012 - arxiv.org
There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies. PI² is a recent example of this …
This study contributes a decision analytic framework to overcome policy inertia and myopia in complex river basin management contexts. The framework combines reservoir policy …
Emerging climate change mitigation policies focus on the implementation of global measures relying on carbon prices to attain rapid emissions reductions, with limited …
Globally, the pressures of expanding populations, climate change, and increased energy demands are motivating significant investments in re-operationalizing existing reservoirs or …
F Stulp, O Sigaud - Paladyn, Journal of Behavioral Robotics, 2013 - degruyter.com
Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. Owing to current trends involving searching in parameter space (rather than …