Authors
Hamid R Tizhoosh
Publication date
2005/12/19
Journal
International conference on artificial intelligence and machine learning
Volume
414
Description
Reinforcement learning is a machine intelligence scheme for learning in highly dynamic and probabilistic environments. The methodology, however, suffers from a major drawback: convergence to an optimal solution usually requires high computational expense, since all states should be visited frequently in order to guarantee a reliable policy. In this paper, a new reinforcement learning algorithm is introduced that achieves faster convergence by taking opposite actions into account. By considering the opposite actions simultaneously, multiple updates can be made for each state observation. This leads to a shorter exploration period and, hence, expedites convergence. Experimental results for grid world problems of different sizes are provided to verify the performance of the proposed approach.
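The idea of making multiple updates per state observation can be illustrated with a minimal sketch. This is not the paper's algorithm as published: the abstract does not give the update rule, so the code assumes a tabular Q-learning setting, a four-direction grid world in which each action's opposite is the reverse direction, and that the opposite action is credited with the negated reward. The names `ACTIONS`, `OPPOSITE`, and `opposition_q_update` are hypothetical.

```python
from collections import defaultdict

# Hypothetical grid-world action set; each action maps to its opposite.
ACTIONS = ["up", "down", "left", "right"]
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def opposition_q_update(q, state, action, reward, next_state,
                        alpha=0.1, gamma=0.9):
    """Update Q for the taken action and, in the same step, for its opposite.

    Assumption (not from the source): the opposite action receives the
    negated reward and no bootstrapped term, since it was never executed
    and its successor state is unknown.
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Standard Q-learning update for the action actually taken.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    # Additional update for the opposite action from the same observation.
    opp = OPPOSITE[action]
    q[(state, opp)] += alpha * (-reward - q[(state, opp)])
    return q


# Usage: one observed transition updates two table entries.
q_table = defaultdict(float)
opposition_q_update(q_table, (0, 0), "right", 1.0, (0, 1))
```

Under these assumptions, each observed transition touches two entries of the Q-table instead of one, which is the mechanism the abstract credits for the shorter exploration period.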
Scholar articles
HR Tizhoosh - International conference on artificial intelligence and …, 2005