A LEARNING ALGORITHM FOR COMMUNICATING MARKOV DECISION PROCESSES WITH UNKNOWN TRANSITION MATRICES - Kyushu University Collection

<Journal Article>
A LEARNING ALGORITHM FOR COMMUNICATING MARKOV DECISION PROCESSES WITH UNKNOWN TRANSITION MATRICES

Creator	Name: Iki, Tetsuichiro (伊喜, 哲一郎); Affiliation: Faculty of Education and Culture, Miyazaki University
	Name: Horiguchi, Masayuki; Affiliation: General Education, Yuge National College of Maritime Technology
	Name: Yasuda, Masami (安田, 正實); Affiliation: Faculty of Science, Chiba University
	Name: Kurano, Masami (蔵野, 正美); Affiliation: Faculty of Education, Chiba University
Language	English
Publisher	Research Association of Statistical Sciences (統計科学研究会)
Date of issue	2007-12
Source title	Bulletin of informatics and cybernetics
Volume	39
Start page	11
End page	24
Publication type	Version of Record
Access rights	open access
Crossref DOI	https://doi.org/10.5109/16771
Related information	Bulletin of informatics and cybernetics, vol. 39, pp. 11-24
	http://bic.math.kyushu-u.ac.jp/
Abstract	This study is concerned with finite Markov decision processes (MDPs) whose states are exactly observable but whose transition matrix is unknown. We develop a learning algorithm of the reward-penalty type... for the communicating case of multi-chain MDPs. An adaptively optimal policy and an asymptotic sequence of adaptive policies with nearly optimal properties are constructed under the average expected reward criterion. A numerical experiment is also given to show the practical effectiveness of the algorithm.

Full-text file

File	File type	Size	Views	Description
bic039_p011	pdf	160 KB	345

Details

PISSN	0286-522X
EISSN	2435-743X
NCID	AA10634475
Record ID	16771
Peer review	Peer reviewed
Subject	Adaptive policy
	Average case
	Communicating case
	Learning algorithm
	Markov decision processes
	Reward-penalty type
	Unknown transition matrix
Type	Journal Article
Registration date	2010.03.11
Last updated	2020.11.02