Reliability of internal prediction/estimation and its application. I. Adaptive action selection...

A Konar, IG Chakraborty, SJ Singh… - … on Systems, Man …, 2013 - ieeexplore.ieee.org

This paper provides a new deterministic Q-learning with a presumed knowledge about the
distance from the current state to both the next state and the goal. This knowledge is …

Cited by 321 Related articles All 8 versions

[PDF] nii.ac.jp

Adaptive intermittent control: A computational model explaining motor intermittency observed in human behavior

Y Sakaguchi, M Tanaka, Y Inoue - Neural Networks, 2015 - Elsevier

It is a fundamental question how our brain performs a given motor task in a real-time fashion
with the slow sensorimotor system. Computational theory proposed an influential idea of …

Cited by 39 Related articles All 8 versions

Artificial neural network model for predicting 5-year mortality after surgery for hepatocellular carcinoma: a nationwide study

HY Shi, KT Lee, JJ Wang, DP Sun, HH Lee… - Journal of gastrointestinal …, 2012 - Springer

Background To validate the use of artificial neural network (ANN) models for predicting 5-
year mortality in HCC and to compare their predictive capability with that of logistic …

Cited by 41 Related articles All 15 versions

[PDF] ualberta.ca

[PDF] Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return.

C Sherstan, DR Ashley, B Bennett, K Young, A White… - UAI, 2018 - sites.ualberta.ca

Temporal-difference (TD) learning methods are widely used in reinforcement learning to
estimate the expected return for each state, without a model, because of their significant …

Cited by 20 Related articles All 4 versions

[PDF] arxiv.org

Lifelong Reinforcement Learning via Neuromodulation

S Lee, S Liebana, C Clopath, W Dabney - arXiv preprint arXiv:2408.08446, 2024 - arxiv.org

Navigating multiple tasks (for instance in succession as in continual or
lifelong learning, or in distributions as in meta or multi-task learning) …

Quantum computation for action selection using reinforcement learning

CL Chen, DY Dong, ZH Chen - International Journal of Quantum …, 2006 - World Scientific

This paper proposes a novel action selection method based on quantum computation and
reinforcement learning (RL). Inspired by the advantages of quantum computation, the …

Cited by 34 Related articles All 5 versions

[PDF] ualberta.ca

[PDF] Predictions, surprise, and predictions of surprise in general value function architectures

J Günther, A Kearney, MR Dawson… - AAAI 2018 Fall …, 2018 - sites.ualberta.ca

Effective life-long deployment of an autonomous agent in a complex environment demands
that the agent has some model of itself and its environment. Such models are inherently …

Cited by 12 Related articles All 3 versions

[PDF] aaai.org

High-confidence off-policy (or counterfactual) variance estimation

Y Chandak, S Shankar, PS Thomas - Proceedings of the AAAI …, 2021 - ojs.aaai.org

Many sequential decision-making systems leverage data collected using prior policies to
propose a new policy. For critical applications, it is important that high-confidence …

Cited by 7 Related articles All 9 versions

[PDF] mlr.press

Managing uncertainty within the KTD framework

M Geist, O Pietquin - Active Learning and Experimental …, 2011 - proceedings.mlr.press

The dilemma between exploration and exploitation is an important topic in reinforcement
learning (RL). Most successful approaches in addressing this problem tend to use some …

Cited by 26 Related articles All 13 versions

[PDF] ualberta.ca

Representation and general value functions

C Sherstan - 2020 - era.library.ualberta.ca

Research in artificial general intelligence aims to create agents that can learn from their own
experience to solve arbitrary tasks in complex and dynamic settings. To do so effectively and …

Cited by 8 Related articles All 2 versions
