A Template Discovery Algorithm by Substring Amplification - Kyushu University Collection

＜Technical Report＞
A Template Discovery Algorithm by Substring Amplification

Creator	Author ID: 100021285; Name: Ikeda, Daisuke (池田, 大輔); Affiliation: Computing and Communications Center, Kyushu University
	Author ID: L002646; Name: Yamada, Yasuhiro (山田, 泰寛); Affiliation: Department of Informatics, Kyushu University
	Author ID: K000008; Name: Hirokawa, Sachio (廣川, 佐千男); Affiliation: Computing and Communications Center, Kyushu University
Language	English
Publisher	Department of Informatics, Kyushu University
Publisher	Department of Informatics, Faculty of Information Science and Electrical Engineering, Kyushu University
Date issued	2003-12
Source title	DOI Technical Report
Volume	220
Publication type	Accepted Manuscript
Access rights	open access
Related DOI	DOI Technical Report || 220
	http://www.i.kyushu-u.ac.jp/research/report.html
	http://www.i.kyushu-u.ac.jp/index.html
Abstract	In this paper, we consider the problem of finding a set of substrings common to given strings. We define this as the template discovery problem: given a set of strings generated by some fixed but unknown pattern, find the constant parts of the pattern. A pattern is a string over constant and variable symbols. It generates strings by replacing variables with constant strings. We assume that the frequency distribution of the replaced strings follows a power-law distribution. Although the longest common subsequence problem, one of the best-known common part discovery problems, is well known to be NP-complete, we show that the template discovery problem can be solved in linear time with high probability. This complexity is achieved thanks to the following contributions: a reformulation of the problem, the use of a set of substrings to express a string, and counting all occurrences $F(f)$ of substrings with frequency $f$ instead of just the frequency $f$. We demonstrate the effectiveness of the proposed algorithm using data on the Web. Moreover, we show its noise robustness and its effectiveness even when input strings are generated by a union of patterns and by patterns with the iteration operation.
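The definitions in the abstract can be illustrated with a small brute-force sketch. It is not the linear-time algorithm of the report, which this record does not describe. It only shows how a pattern generates strings and how a quantity like $F(f)$ could be tabulated. The function names, the use of `None` for a variable, the `max_len` cutoff, and the reading of $F(f)$ as "$f$ times the number of distinct substrings with frequency $f$" are all assumptions made for illustration.

```python
from collections import Counter


def instantiate(pattern, fillers):
    """Build one string from a pattern; None marks a variable slot (assumed encoding)."""
    fillers = list(fillers)
    out = []
    for part in pattern:
        out.append(fillers.pop(0) if part is None else part)
    return "".join(out)


def substring_counts(strings, max_len):
    """Count occurrences of every substring of length 1..max_len (brute force)."""
    counts = Counter()
    for s in strings:
        for i in range(len(s)):
            for j in range(i + 1, min(i + max_len, len(s)) + 1):
                counts[s[i:j]] += 1
    return counts


def occurrences_by_frequency(counts):
    """Assumed reading of F(f): total occurrences of all substrings with frequency f."""
    totals = Counter()
    for f in counts.values():
        totals[f] += f
    return totals


# Hypothetical toy input: one variable between two constant parts.
pattern = ["<b>", None, "</b>"]
strings = [instantiate(pattern, [w]) for w in ("cat", "dog", "emu")]
counts = substring_counts(strings, max_len=4)
totals = occurrences_by_frequency(counts)
```

In this toy input the constant parts `<b>` and `</b>` occur once in every generated string, while each replaced string such as `cat` occurs only once overall, so the two kinds of substring fall at different frequencies.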

Full-text file

File	File type	Size	Views	Description
trcs220	pdf	162 KB	387

Details

Record ID	2815
Peer reviewed	No
Notes	13 p.
Type	Technical Report
Date registered	2009.04.22
Date updated	2017.01.18
