DocumentCode :
2949395
Title :
Sequential pattern mining using personalized minimum support threshold with minimum items
Author :
Alias, Suraya ; Razali, Mohd Norhisham ; Fun, Tan Soo ; Sainin, Mohd Shamrie
Author_Institution :
Sch. of Eng. & Inf. Technol., Univ. Malaysia Sabah, Kota Kinabalu, Malaysia
fYear :
2011
fDate :
23-24 Nov. 2011
Firstpage :
1
Lastpage :
6
Abstract :
One of the challenges of Sequential Pattern Mining is finding frequent sequential patterns in huge click stream data (web logs), because such data has a very low support distribution. In Frequent Pattern Discovery, a sequence is considered frequent if its support exceeds the minimum support (min_sup) threshold. The conventional approach of assuming that one min_sup value is valid for all levels of k-sequence may affect the overall results or pattern generation. In this paper, a personalized minimum support (P_minsup) threshold with a user-specified minimum number of items, min_i, is introduced. P_minsup is generated for each k-sequence by analyzing the overall support distribution of the click stream data, while the min_i value gives the user control over the number of patterns generated for the next k-sequence by using the top min_i items. This approach is then applied in the SPADE algorithm using a vector array, as an extension of the previous method that used a relational database and a pre-defined threshold. The experimental results demonstrate that P_minsup, complemented by the min_i value, can assist in determining a suitable threshold value for detecting users' frequent k-sequential topics when navigating the World Wide Web (WWW).
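The idea of a per-level threshold combined with a top min_i cut-off can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state how P_minsup is derived from the support distribution, so the mean support at each level is used here purely as a placeholder rule. Contiguous k-item runs stand in for SPADE's candidate generation, and the names `support_counts`, `mine_level`, and the sample sessions are hypothetical.

```python
from collections import Counter


def support_counts(sessions, k):
    """Count, for each contiguous k-item run, the number of sessions containing it."""
    counts = Counter()
    for session in sessions:
        # A set ensures each pattern is counted at most once per session.
        seen = {tuple(session[i:i + k]) for i in range(len(session) - k + 1)}
        counts.update(seen)
    return counts


def mine_level(sessions, k, min_i):
    """Return (threshold, kept) for level k.

    ASSUMPTION: the per-level threshold is the mean support at that level.
    The abstract only says it is derived from the support distribution.
    """
    counts = support_counts(sessions, k)
    if not counts:
        return 0.0, []
    threshold = sum(counts.values()) / len(counts)
    # Keep patterns at or above the threshold, highest support first.
    passing = sorted(
        ((p, s) for p, s in counts.items() if s >= threshold),
        key=lambda item: (-item[1], item[0]),
    )
    # Cap the survivors at the top min_i patterns.
    return threshold, passing[:min_i]


# Hypothetical click stream sessions (one list of visited topics per user).
sessions = [
    ["home", "news", "sports"],
    ["home", "news"],
    ["home", "sports"],
    ["home", "news", "weather"],
]

for k in (1, 2):
    threshold, kept = mine_level(sessions, k, min_i=2)
    print(k, threshold, kept)
```

In this toy data the placeholder threshold falls from 2.5 at k = 1 to 1.5 at k = 2, which shows why a single fixed min_sup chosen for one level can be unsuitable for the next.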
Keywords :
Internet; data mining; SPADE algorithm; World Wide Web; frequent pattern discovery technique; huge click stream data; min_i value; personalized minimum support threshold; relational database; sequential pattern mining; Arrays; Business; Data mining; Itemsets; Lattices; Navigation; Vectors; Sequential Pattern; Web Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Research and Innovation in Information Systems (ICRIIS), 2011 International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-61284-295-0
Type :
conf
DOI :
10.1109/ICRIIS.2011.6125688
Filename :
6125688