Authors
Hamid R Tizhoosh
Publication date
2005/12/19
Journal
International conference on artificial intelligence and machine learning
Volume
414
Description
Reinforcement learning is a machine intelligence scheme for learning in highly dynamic and probabilistic environments. The methodology, however, suffers from a major drawback: convergence to an optimal solution usually requires high computational expense, since all states must be visited frequently to guarantee a reliable policy. In this paper, a new reinforcement learning algorithm is introduced that achieves faster convergence by taking opposite actions into account. By considering opposite actions simultaneously, multiple updates can be made for each state observation. This leads to a shorter exploration period and hence expedites convergence. Experimental results for grid world problems of different sizes are provided to verify the performance of the proposed approach.
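The idea of making more than one update per state observation can be illustrated with a minimal sketch. This is not the paper's algorithm: it is an ordinary tabular Q-learning loop on a hypothetical grid world, extended under two loudly stated assumptions. First, the opposite of a move is taken to be the move in the reverse direction (up/down, left/right). Second, the opposite action is assumed to earn the negated reward and to share the same bootstrapped next-state value. The names `step`, `opposition_update`, and `train`, the reward shaping, and all hyperparameters are illustrative choices, not taken from the source.

```python
import random

ACTIONS = ["up", "down", "left", "right"]
# Assumption: the "opposite" of a move is the move in the reverse direction.
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, size, goal):
    """Hypothetical grid world: move one cell, clipped at the walls."""
    row = min(max(state[0] + MOVES[action][0], 0), size - 1)
    col = min(max(state[1] + MOVES[action][1], 0), size - 1)
    next_state = (row, col)

    def dist(s):
        return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

    # Assumed reward shaping: +1 if the move gets closer to the goal, else -1.
    reward = 1.0 if dist(next_state) < dist(state) else -1.0
    return next_state, reward, next_state == goal


def opposition_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Update the taken action and, from the same observation, its opposite."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    # Standard Q-learning update for the action actually taken.
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    # Assumption (not from the source): the opposite action receives the
    # negated reward and reuses the same bootstrapped value.
    opposite = OPPOSITE[action]
    q[(state, opposite)] += alpha * (-reward + gamma * best_next - q[(state, opposite)])
    return opposite


def train(size=5, episodes=200, epsilon=0.1, seed=0):
    """Epsilon-greedy training loop; returns the learned Q-table."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in ACTIONS}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(4 * size * size):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action, size, goal)
            opposition_update(q, state, action, reward, next_state)
            state = next_state
            if done:
                break
    return q
```

Calling `train()` returns a Q-table keyed by `(state, action)`. Each call to `opposition_update` changes two entries for a single observed transition, which is the sense in which such a scheme could shorten exploration. Whether the negated-reward assumption is reasonable depends on the environment.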
Total citations
[Bar chart: citations per year, 2006-2024]
Scholar articles
HR Tizhoosh - International conference on artificial intelligence and …, 2005