DocumentCode :
679558
Title :
Efficiently Mining Top-K High Utility Sequential Patterns
Author :
Junfu Yin ; Zhigang Zheng ; Longbing Cao ; Yin Song ; Wei Wei
Author_Institution :
Advanced Analytics Institute, University of Technology, Sydney, NSW, Australia
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
1259
Lastpage :
1264
Abstract :
High utility sequential pattern mining is an emerging topic in the data mining community. Compared with classic frequent sequence mining, the utility framework provides more informative and actionable knowledge, since the utility of a sequence indicates business value and impact. However, the introduction of "utility" makes the problem fundamentally different from the frequency-based pattern mining framework and brings substantial challenges. Although existing high utility sequential pattern mining algorithms can discover all patterns satisfying a given minimum utility, it is often difficult for users to set a proper minimum utility. A value that is too small may produce thousands of patterns, whereas one that is too large may yield no patterns at all. In this paper, we propose a novel framework called top-k high utility sequential pattern mining to tackle this critical problem. Accordingly, an efficient algorithm, Top-k high Utility Sequence (TUS for short) mining, is designed to identify the top-k high utility sequential patterns without a user-specified minimum utility. In addition, three effective features are introduced to handle the efficiency problem: two strategies for raising the threshold and one pruning strategy for filtering unpromising items. Experiments are conducted on both synthetic and real datasets. The results show that TUS, incorporating these efficiency-enhancing strategies, demonstrates impressive performance without missing any high utility sequential patterns.
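The general idea of replacing a user-supplied minimum utility with an internal threshold that rises as patterns are found can be sketched as below. This is not the TUS algorithm and does not reproduce its threshold-raising or pruning strategies, which the abstract does not detail. It is only a minimal illustration, assuming a hypothetical stream of (pattern, utility) pairs whose utilities have already been computed, with the function name `top_k_by_utility` invented for this example.

```python
import heapq


def top_k_by_utility(candidates, k):
    """Keep the k highest-utility patterns seen so far.

    `candidates` is a hypothetical iterable of (pattern, utility) pairs;
    how the utilities are computed is outside this sketch.
    Returns (patterns sorted by descending utility, final threshold).
    """
    heap = []       # min-heap of (utility, pattern); heap[0] is the weakest kept pattern
    threshold = 0   # internal minimum utility: starts at 0 and never decreases
    for pattern, utility in candidates:
        if len(heap) < k:
            # Fewer than k patterns kept so far: accept unconditionally.
            heapq.heappush(heap, (utility, pattern))
        elif utility > threshold:
            # Better than the weakest kept pattern: swap it in.
            heapq.heapreplace(heap, (utility, pattern))
        if len(heap) == k:
            # Once k patterns are held, the weakest one sets the threshold.
            threshold = heap[0][0]
    ranked = sorted(((p, u) for u, p in heap), key=lambda pu: -pu[1])
    return ranked, threshold


# Toy input; the patterns and utility values are made up for illustration.
toy = [("a", 10), ("ab", 25), ("b", 5), ("abc", 40), ("c", 18)]
ranked, threshold = top_k_by_utility(toy, 3)
```

In a real miner the rising threshold would also be used to skip candidates that cannot reach it, which is where pruning pays off. The sketch only shows how the threshold itself is maintained.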
Keywords :
business data processing; data mining; information filtering; utility theory; TUS mining; business value; data mining community; real datasets; synthetic datasets; top-k high utility sequential pattern mining; unpromising item filtering; Algorithm design and analysis; Business; Data mining; Itemsets; Sequences; Sorting; High utility sequential pattern mining; Top-K sequential pattern mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
ISSN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2013.148
Filename :
6729631