Title :
Extracting frequent subsequences from a single long data sequence: a novel anti-monotonic measure and a simple on-line algorithm
Author :
Iwanuma, Koji ; Ishihara, Ryuichi ; Takano, Yo ; Nabeshima, Hidetomo
Author_Institution :
Dept. of Comput. Sci. & Media Eng., Yamanashi Univ., Kofu, Japan
Abstract :
In this paper, we study frequent subsequence extraction from a single, very long data sequence. First, we propose a novel frequency measure, called the total frequency, for counting multiple occurrences of a sequential pattern in a single data sequence. The total frequency is anti-monotonic and makes it possible to count pattern occurrences without duplication. Moreover, the total frequency lends itself to an implementation based on dynamic programming. Second, we give a simple on-line algorithm for a specialized subsequence extraction problem, namely the problem with infinite window length. This specialized problem can be regarded as a relaxation of the general problem; the fast on-line algorithm is therefore important for practical applications.
Keywords :
data mining; antimonotonic measure; frequent subsequence extraction; online algorithm; total frequency measure; Computer science; Data engineering; Data mining; Databases; Dynamic programming; Frequency measurement; Itemsets
Conference_Titel :
Data Mining, Fifth IEEE International Conference on
Print_ISBN :
0-7695-2278-5
DOI :
10.1109/ICDM.2005.60