Article Information

Title: About the Reinforcement Function for Profit Sharing
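Authors: Wataru Uemura ; Shoji Tatsumi
Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
Print ISSN: 1346-0714
Online ISSN: 1346-8030
Year: 2004
Volume: 19
Issue: 4
Pages: 197-203
DOI: 10.1527/tjsai.19.197
Publisher: The Japanese Society for Artificial Intelligence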
Abstract: In this paper, we consider profit sharing, one of the reinforcement learning methods. An agent learns a candidate solution to a problem from the reward it receives from the environment if and only if it reaches the destination state. The function that distributes the received reward over the actions of the candidate solution is called the reinforcement function. In this learning system, the agent reinforces the set of selected actions when it receives the reward, and it should not reinforce detour actions. First, we propose a new constraint equation on reinforcement functions so that the reinforcement values are distributed over the non-detour actions. If the reinforcement function satisfies this constraint equation, the agent can select the non-detour actions leading to the destination state. Next, we show that the reinforcement function can be made constant after the learning process has suppressed the selection of detour actions. Lastly, in computer simulations of maze problems, we show that the learning performance of the agents does not depend on the size of the environment.
Keywords: reinforcement learning ; profit sharing ; rationality theory ; reinforcement function ; detour rule
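
Illustrative note: the abstract describes a reinforcement function that spreads a goal reward back over the actions of an episode. The sketch below shows that general idea only. The geometrically decreasing function, the `decay` value, and all names (`geometric_reinforcement`, `reinforce_episode`, the toy states and actions) are assumptions chosen for illustration; this is not the constraint equation proposed in the paper, which the record does not state.

```python
from collections import defaultdict


def geometric_reinforcement(reward, steps_from_goal, decay=0.25):
    """Hypothetical reinforcement function: the share shrinks geometrically
    with the number of steps between the action and the goal."""
    return reward * decay ** steps_from_goal


def reinforce_episode(weights, episode, reward, decay=0.25):
    """Distribute `reward` over the (state, action) pairs of one episode.

    The last pair in `episode` is the one that reached the goal, so it
    receives the largest share; earlier pairs receive smaller shares.
    """
    for steps_from_goal, pair in enumerate(reversed(episode)):
        weights[pair] += geometric_reinforcement(reward, steps_from_goal, decay)
    return weights


# Toy episode (assumed): at s1 the agent first takes "left", which loops
# back to s0 (a detour), and later takes "up", which reaches the goal.
weights = defaultdict(float)
episode = [("s0", "right"), ("s1", "left"), ("s0", "right"), ("s1", "up")]
reinforce_episode(weights, episode, reward=1.0)

for pair, value in sorted(weights.items()):
    print(pair, value)
```

In this toy run the goal-reaching action at `s1` ends up with a larger weight than the detour action at the same state, which is the qualitative behavior the abstract asks of a reinforcement function. Whether a given decay rate is steep enough in general depends on the constraint derived in the paper.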