＜conference paper＞
Finding Repetitive Patterns Using FFT

Creator	Nakatoh, Tetsuya (中藤, 哲也); Author PID K000012; Affiliation: Computing and Communications Center, Kyushu University (九州大学情報基盤センター)
Creator	Hirokawa, Sachio (廣川, 佐千男); Author PID K000008; Affiliation: Computing and Communications Center, Kyushu University (九州大学情報基盤センター)
Language	Japanese
Publisher	Information Processing Society of Japan (情報処理学会)
Date	2003-07
Source Title	IPSJ SIG Technical Reports: Database Systems (情報処理学会研究報告 : データベースシステム)
Vol	2003
Issue	71
First Page	311
Last Page	318
Publication Type	Accepted Manuscript
Access Rights	open access
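Rights	Notice on use of the work published here: the copyright of this work belongs to the Information Processing Society of Japan (情報処理学会). The work is published here with the permission of the Information Processing Society of Japan as copyright holder. When using it, please comply with the Copyright Act and the IPSJ Code of Ethics.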
Relation	IPSJ SIG Technical Reports: Database Systems (情報処理学会研究報告 : データベースシステム), 2003(71), pp. 311-318
Related URI	http://matu.cc.kyushu-u.ac.jp/
Abstract	Data mining and text mining, techniques for extracting non-obvious information from semi-structured text, are very important for handling the growing volume of information on the WWW. One such technique is the discovery of patterns that appear repeatedly in the target data. The discovered patterns can be used to process the data or to extract new information from it. A naive way to find repetitive patterns is to overlay the data on a shifted copy of itself and look for the matching portions. However, for a text of size n this method takes O(n^2) time, which is impractical for large data. In this paper, we apply our efficient FFT-based algorithm for string matching with mismatches and propose a method that finds repetitive patterns in O(n log n) time.
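
The contrast drawn in the abstract, between comparing the text with shifted copies of itself and using an FFT, can be illustrated with a small sketch. It is not the authors' algorithm, whose details are not given in this record. It only assumes the standard trick of building one 0/1 indicator sequence per distinct character and computing its autocorrelation with an FFT. All function names are hypothetical. For each shift d it counts the positions i where text[i] == text[i + d]; shifts with high counts suggest a repetition period. With one FFT per distinct character, the cost is roughly O(k n log n) for k distinct characters, compared with O(n^2) for the naive loop.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT (unscaled); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def shift_match_counts(text):
    """For each shift d, count positions i with text[i] == text[i + d]."""
    n = len(text)
    size = 1
    while size < 2 * n:  # zero-pad so circular correlation does not wrap
        size *= 2
    totals = [0.0] * n
    for ch in set(text):
        # Indicator sequence: 1 where the text holds this character.
        indicator = [1.0 if c == ch else 0.0 for c in text] + [0.0] * (size - n)
        spectrum = fft(indicator)
        # The inverse transform of the power spectrum is the autocorrelation.
        power = [abs(v) ** 2 for v in spectrum]
        corr = fft(power, invert=True)
        for d in range(n):
            totals[d] += corr[d].real / size
    return [round(t) for t in totals]


def naive_shift_match_counts(text):
    """Quadratic-time reference: compare the text with each shifted copy."""
    n = len(text)
    return [sum(text[i] == text[i + d] for i in range(n - d)) for d in range(n)]


if __name__ == "__main__":
    print(shift_match_counts("abcabcabc"))  # [9, 0, 0, 6, 0, 0, 3, 0, 0]
```

In the example, the peaks at shifts 3 and 6 reflect the period of the repeated unit "abc". A practical implementation would more likely use an optimized FFT library than this recursive version.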

File	FileType	Size	Views	Description
2003_b_3	pdf	742 KB	434

Details

Record ID	2964
Peer-Reviewed	Unrefereed
Subject Terms	Finding Repetitive Patterns (繰り返しパターン発見)
	Mining (マイニング)
	Semi-structured Text (半構造データ)
	String Matching with Mismatches (近似文字列照合)
	Search Engine (検索エンジン)
	FFT
	Pattern Discovery and Extraction (パターン発見と抽出)
Notes	IPSJ SIG Technical Report (情報処理学会研究会報告) (DBWS2003), 2003.07
Type	Conference paper (会議発表論文)
Created Date	2009.04.22
Modified Date	2017.01.19
