Optimized Substructure Discovery for Semi-structured Data - Kyushu University Collection

＜Technical Report＞
Optimized Substructure Discovery for Semi-structured Data

Authors	Abe, Kenji (安部, 賢治), Department of Informatics, Kyushu University
	Kawasoe, Shinji (川副, 真治), Department of Informatics, Kyushu University
	Asai, Tatsuya (浅井, 達哉), Department of Informatics, Kyushu University
	Arimura, Hiroki (有村, 博紀), PRESTO, JST; Department of Informatics, Kyushu University
	Arikawa, Setsuo (有川, 節夫), Department of Informatics, Kyushu University
Language	English
Publisher	Department of Informatics, Kyushu University
Date of issue	2002-03
Source title	DOI Technical Report
Volume	206
Publication type	Accepted Manuscript
Access rights	open access
Related information	DOI Technical Report, 206
Related URI	http://www.i.kyushu-u.ac.jp/research/report.html
Abstract	We address the problem of finding interesting substructures in a collection of semi-structured data such as XML or HTML. Our data mining framework is optimized pattern discovery, introduced by Fukuda et al., where the goal of a mining algorithm is to discover a pattern that optimizes a given statistical measure, such as the information entropy, over a class of simple patterns. In this paper, modeling semi-structured data as labeled ordered trees, we study an efficient algorithm for the optimized pattern discovery problem for this class. In a previous paper, we developed the rightmost expansion technique and the incremental occurrence update technique by generalizing the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, in order to implement an efficient frequent pattern miner for the class of labeled ordered trees. By combining these techniques with the pruning technique for optimized patterns of Morishita and Sese (PODS'00), we present an efficient algorithm for finding optimized patterns for labeled ordered trees of bounded size. Experimental results show that our algorithm performs well over a range of data sizes and parameter settings. We also show an approximation hardness result for labeled ordered trees of unbounded size.

Full-text files

File	File type	Size	Views	Description
trcs206	pdf	930 KB	388
trcs206.ps	gz	0.98 MB	181

Details

Record ID	3050
Peer review	Not peer-reviewed
Type	Technical Report
Date registered	2009.04.22
Date updated	2018.08.31
