A deterministic improved Q-learning for path planning of a mobile robot

A Konar, IG Chakraborty, SJ Singh… - … on Systems, Man …, 2013 - ieeexplore.ieee.org
This paper provides a new deterministic Q-learning with a presumed knowledge about the
distance from the current state to both the next state and the goal. This knowledge is …

Adaptive intermittent control: A computational model explaining motor intermittency observed in human behavior

Y Sakaguchi, M Tanaka, Y Inoue - Neural Networks, 2015 - Elsevier
It is a fundamental question how our brain performs a given motor task in a real-time fashion
with the slow sensorimotor system. Computational theory proposed an influential idea of …

Artificial neural network model for predicting 5-year mortality after surgery for hepatocellular carcinoma: a nationwide study

HY Shi, KT Lee, JJ Wang, DP Sun, HH Lee… - Journal of gastrointestinal …, 2012 - Springer
Background: To validate the use of artificial neural network (ANN) models for predicting
5-year mortality in HCC and to compare their predictive capability with that of logistic …

Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return

C Sherstan, DR Ashley, B Bennett, K Young, A White… - UAI, 2018 - sites.ualberta.ca
Temporal-difference (TD) learning methods are widely used in reinforcement learning to
estimate the expected return for each state, without a model, because of their significant …

Lifelong Reinforcement Learning via Neuromodulation

S Lee, S Liebana, C Clopath, W Dabney - arXiv preprint arXiv:2408.08446, 2024 - arxiv.org
Navigating multiple tasks, for instance in succession as in continual or
lifelong learning, or in distributions as in meta or multi-task learning, …

Quantum computation for action selection using reinforcement learning

CL Chen, DY Dong, ZH Chen - International Journal of Quantum …, 2006 - World Scientific
This paper proposes a novel action selection method based on quantum computation and
reinforcement learning (RL). Inspired by the advantages of quantum computation, the …

Predictions, surprise, and predictions of surprise in general value function architectures

J Günther, A Kearney, MR Dawson… - AAAI 2018 Fall …, 2018 - sites.ualberta.ca
Effective life-long deployment of an autonomous agent in a complex environment demands
that the agent has some model of itself and its environment. Such models are inherently …

High-confidence off-policy (or counterfactual) variance estimation

Y Chandak, S Shankar, PS Thomas - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Many sequential decision-making systems leverage data collected using prior policies to
propose a new policy. For critical applications, it is important that high-confidence …

Managing uncertainty within the KTD framework

M Geist, O Pietquin - Active Learning and Experimental …, 2011 - proceedings.mlr.press
The dilemma between exploration and exploitation is an important topic in reinforcement
learning (RL). Most successful approaches in addressing this problem tend to use some …

Representation and general value functions

C Sherstan - 2020 - era.library.ualberta.ca
Research in artificial general intelligence aims to create agents that can learn from their own
experience to solve arbitrary tasks in complex and dynamic settings. To do so effectively and …