作成者 |
|
|
|
|
|
|
本文言語 |
|
出版者 |
|
|
発行日 |
|
収録物名 |
|
巻 |
|
出版タイプ |
|
アクセス権 |
|
関連DOI |
|
|
関連URI |
|
|
関連情報 |
|
|
概要 |
In this paper, we consider a data mining problem for semi-structured data. Modeling semi-structured data as labeled ordered trees, we present an efficient algorithm for discovering frequent substructu...res from a large collection of semi-structured data. By extending the enumeration technique developed by Bayardo (SIGMOD'98) for discovering long itemsets, our algorithm scales almost linearly in the total size of maximal tree patterns contained in an input collection depending mildly on the size of the longest pattern. We also developed several pruning techniques that significantly speed-up the search. Experiments on Web data show that the our algorithm runs efficiently on real-life datasets combined with proposed pruning techniques in the wide range of parameters.続きを見る
|