Policy Learning Using Modified Learning Vector Quantization for Reinforcement Learning Problems - 九大コレクション (Kyudai Collection) | Kyushu University Library

<Departmental Bulletin Paper>
Policy Learning Using Modified Learning Vector Quantization for Reinforcement Learning Problems

Creator	Name: Afif Mohd Faudzi, Ahmad; Affiliation: Department of Electrical and Electronic Engineering, Graduate School of Information and Electrical Engineering, Kyushu University | Department of Electrical and Electronic Engineering, Universiti Malaysia
Creator	Author ID: K000238; Name: 村田, 純一 / Murata, Junichi / ムラタ, ジュンイチ; Affiliation: Department of Electrical Engineering, Faculty of Information Science and Electrical Engineering, Kyushu University : Professor
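Language	English
Publisher	Faculty of Information Science and Electrical Engineering, Kyushu University (九州大学大学院システム情報科学研究院)
Date Issued	2015-07-24
Source Title	九州大学大学院システム情報科学紀要 (Research Reports on Information Science and Electrical Engineering of Kyushu University)
Volume	20
Issue	2
Start Page	39
End Page	44
Publication Type	Version of Record
Access Rights	open access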
JaLC DOI	https://doi.org/10.15017/1560523
Abstract	Reinforcement learning (RL) enables an agent to find an optimal solution to a problem by interacting with the environment. In previous research, Q-learning, one of the popular learning methods in RL, was used to generate a policy, from which an abstract policy was extracted by the LVQ algorithm. In this paper, the aim is to train the agent to learn an optimal policy from scratch and to generate the abstract policy in a single operation by the LVQ algorithm. When the LVQ algorithm is applied in an RL framework, an erroneous teaching signal in the LVQ algorithm sometimes causes the learning to end in failure or in a non-optimal solution. Here, a new LVQ algorithm is proposed to overcome this problem. The new LVQ algorithm introduces, first, a regular reward that the agent obtains autonomously based on its behavior and, second, a function that converts the regular reward to a new reward so that the learning system does not suffer from the undesirable effect of a small reward. Through these modifications, the agent is expected to find the optimal solution more efficiently.

Full Text Files

File	File Type	Size	Views	Description
p039	pdf	439 KB	264

Details

PISSN	1342-3819
EISSN	2188-0891
NCID	AN10569524
Record ID	1560523
Peer Review	Peer-reviewed
Subject	Policy learning
	Learning Vector Quantization
	Reinforcement learning and Abstraction
Date Registered	2016.02.03
Date Updated	2020.10.12