DocumentCode
1910773
Title
Frequent Approximate Substring Pattern Mining Using Locally Optimal Occurrence Counting
Author
Nakamura, Atsuyoshi ; Tosaka, Hisashi ; Kudo, Mineichi
Author_Institution
Hokkaido Univ., Sapporo, Japan
fYear
2012
fDate
20-22 Sept. 2012
Firstpage
54
Lastpage
59
Abstract
We propose a novel frequent approximate pattern mining in which occurrences themselves are valuable regions to extract. Given a string s, our mining task is to enumerate its sub strings that locally optimally match many sub strings of s. We show an algorithm for this problem whose time and space complexities are O(n3) for a string with length n. According to our experiments using synthetic data, our algorithm can correctly extract embedded frequent patterns as both patterns and their occurrences even if the embedded patterns are corrupted by random edit operations.
Keywords
computational complexity; data mining; embedded frequent pattern extraction; frequent approximate substring pattern mining; locally optimal occurrence counting; mining task; random edit operation; space complexity; synthetic data; time complexity; Approximation algorithms; Complexity theory; Data mining; Electronic mail; Hamming distance; Optimized production technology; Pattern matching; data mining; frequent pattern; local alignment; string algorithm;
fLanguage
English
Publisher
ieee
Conference_Titel
Advanced Applied Informatics (IIAIAAI), 2012 IIAI International Conference on
Conference_Location
Fukuoka
Print_ISBN
978-1-4673-2719-0
Type
conf
DOI
10.1109/IIAI-AAI.2012.20
Filename
6337157
Link To Document