DocumentCode :
1804447
Title :
Text Mining using PrefixSpan constrained by Item Interval and Item Attribute
Author :
Sato, Issei ; Hirate, Yu ; Yamana, Hayato
Author_Institution :
Waseda University, Japan
fYear :
2006
fDate :
2006
Abstract :
Applying conventional sequential pattern mining methods to text data extracts many uninteresting patterns, which increases the time needed to interpret the extracted patterns. To solve this problem, we propose a new sequential pattern mining algorithm that adopts two constraints. The first selects sequences with regard to item intervals (the number of items between any two adjacent items in a sequence), and the second selects sequences with regard to item attributes. Using Amazon customer reviews in the book category, we have confirmed that our method extracts patterns faster than the conventional method and is better able to exclude uninteresting patterns while retaining the patterns of interest.
Keywords :
Conferences; Data engineering; Data mining; Databases; Electronic mail; Frequency; Text mining
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering Workshops, 2006. Proceedings. 22nd International Conference on
Conference_Location :
Atlanta, GA, USA
Print_ISBN :
0-7695-2571-7
Type :
conf
DOI :
10.1109/ICDEW.2006.142
Filename :
1623913