<Technical Report>
Efficient Substructure Discovery from Large Semi-structured Data

Author
Language
Publisher
Date issued
Source title
Publication type
Access rights
Related DOI
Related URI
Related information
Abstract In this paper, we consider a data mining problem for semi-structured data. Modeling semi-structured data as labeled ordered trees, we present an efficient algorithm for discovering frequent substructures from a large collection of semi-structured data. By extending the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, our algorithm scales almost linearly in the total size of the maximal tree patterns contained in an input collection, depending only mildly on the size of the longest pattern. We also developed several pruning techniques that significantly speed up the search. Experiments on Web data show that our algorithm, combined with the proposed pruning techniques, runs efficiently on real-life datasets over a wide range of parameters.

Full-text files

pdf trcs200 pdf 493 KB 225  
gz trcs200.ps gz 474 KB 119  

Details

Record ID
Peer reviewed
Type
Date registered 2009.04.22
Date updated 2018.08.31
