<Conference Paper>
Extraction of Tag Tree Patterns with Contractible Variables from Irregular Semistructured Data

Author
Language
Publisher
Date of issue
Journal title
Start page
End page
Publication type
Access rights
Abstract Information extraction from semistructured data is becoming increasingly important. To extract meaningful or interesting content from semistructured data, we need to extract common structured patterns from it. Much semistructured data contains irregularities such as missing or erroneous data. A tag tree pattern is an edge-labeled tree with ordered children whose structure is built from tags and structured variables. An edge label is a tag, a keyword, or a wildcard, and a variable can be substituted by an arbitrary tree. In particular, a contractible variable matches any subtree, including a singleton vertex. A tag tree pattern is therefore well suited to representing common tree-structured patterns in irregular semistructured data. We present a new method for extracting characteristic tag tree patterns from irregular semistructured data, using an algorithm that finds a least generalized tag tree pattern explaining the given data. We report experiments applying this method to the extraction of characteristic tag tree patterns from irregular semistructured data.
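
The matching behaviour described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm or its formal definition: it is a simplified model built on assumptions made only for this illustration. Variables appear only in leaf positions, a variable stands for one or more consecutive child subtrees, a contractible variable may also stand for nothing (the singleton-vertex case), and keyword labels are omitted. All names (`Tree`, `Pattern`, `Var`, `matches`, `WILDCARD`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

WILDCARD = "?"  # hypothetical marker for a wildcard edge label


@dataclass
class Tree:
    """Ordered tree; each child is reached through a labeled edge."""
    children: List[Tuple[str, "Tree"]] = field(default_factory=list)


@dataclass
class Var:
    """Leaf variable (simplifying assumption: variables occur only at leaves)."""
    contractible: bool = False


@dataclass
class Pattern:
    """Ordered pattern node; each child is a labeled edge or a variable."""
    children: List[Union[Tuple[str, "Pattern"], Var]] = field(default_factory=list)


def matches(p: Pattern, t: Tree) -> bool:
    """Return True if pattern p matches tree t under the simplified model."""
    return _match_seq(p.children, 0, t.children, 0)


def _match_seq(ps, i, ts, j) -> bool:
    # Every pattern child must be used and every tree child must be covered.
    if i == len(ps):
        return j == len(ts)
    item = ps[i]
    if isinstance(item, Var):
        # A plain variable absorbs at least one child subtree;
        # a contractible variable may also absorb none.
        start = j if item.contractible else j + 1
        return any(_match_seq(ps, i + 1, ts, k) for k in range(start, len(ts) + 1))
    if j == len(ts):
        return False
    label, sub = item
    t_label, t_sub = ts[j]
    if label != WILDCARD and label != t_label:
        return False
    return matches(sub, t_sub) and _match_seq(ps, i + 1, ts, j + 1)


# Two made-up records: one complete, one with missing fields.
leaf = Tree()
rec_full = Tree([("name", leaf), ("phone", leaf), ("email", leaf)])
rec_missing = Tree([("name", leaf)])

with_contractible = Pattern([("name", Pattern()), Var(contractible=True)])
with_plain_var = Pattern([("name", Pattern()), Var(contractible=False)])
```

In this sketch `with_contractible` matches both records, while `with_plain_var` rejects `rec_missing` because a plain variable must absorb at least one subtree. This is the sense in which contractible variables tolerate missing data.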

2003_b_5 pdf 180 KB 71  

Details

Record ID
Peer review
Rights
Related information
Subject
ISSN
NCID
Notes
Type
Date registered 2012.02.28
Date updated 2018.08.31