Journal title: Advances in Information Technology and Management
Print ISSN: 2167-6372
Publication year: 2012
Volume: 1
Issue: 4
Pages: 166-169
Language: English
Publisher: World Science Publisher
Abstract: To address the non-convergence of function estimation and the difficulty of reliability distribution in reinforcement learning, GASA-Q-learning was presented for solving Markov process problems under uncertainty, using a genetic algorithm, a simulated annealing algorithm, and heuristic rules as the means. The iterative process of value function estimation was transformed into a continuously evolving process over the reliability state space, and the Q-value function and a distance function were then used to adjust the fitness value and the energy function, respectively. Finally, these functions were combined with the probability distribution and the reliability space to aid in finding the optimal policy. The pushing experiments show that the proposed method is robust and converges quickly.
Keywords: Q-learning; hybrid genetic algorithm; task planning under uncertainty; reinforcement learning; belief state
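Illustrative sketch: the abstract names Q-learning and simulated annealing as two of the ingredients of GASA-Q-learning but gives no formulas. The Python fragment below shows only the textbook versions of those two pieces, a tabular Q-learning update and a Metropolis acceptance test used for action selection. It is not the paper's algorithm: the function names, the dictionary-based Q-table, the parameters `alpha`, `gamma` and `temperature`, and the use of Q-values as the annealing "energy" are all assumptions made for illustration, and the genetic-algorithm and heuristic-rule components are omitted.

```python
import math
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Textbook tabular Q-learning update (assumed form, not taken from the paper)."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def metropolis_accept(current_value, candidate_value, temperature, rng=random):
    """Simulated-annealing acceptance test.

    A candidate that is at least as good is always accepted; a worse one is
    accepted with probability exp(delta / temperature). temperature must be > 0.
    """
    delta = candidate_value - current_value
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


def select_action(q, state, temperature, rng=random):
    """Pick a random candidate action and keep it only if it passes the
    Metropolis test against the greedy action; otherwise act greedily."""
    greedy = max(q[state], key=q[state].get)
    candidate = rng.choice(list(q[state]))
    if metropolis_accept(q[state][greedy], q[state][candidate], temperature, rng):
        return candidate
    return greedy
```

With a high temperature the selection is close to random exploration; as the temperature is lowered toward zero it approaches greedy selection, which is the usual motivation for pairing an annealing schedule with Q-learning.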