DocumentCode :
3125664
Title :
A New Markov Model for Clustering Categorical Sequences
Author :
Xiong, Tengke ; Wang, Shengrui ; Jiang, Qingshan ; Huang, Joshua Zhexue
Author_Institution :
Dept. of Comput. Sci., Univ. of Sherbrooke, Sherbrooke, QC, Canada
fYear :
2011
fDate :
11-14 Dec. 2011
Firstpage :
854
Lastpage :
863
Abstract :
Clustering categorical sequences remains an open and challenging task due to the lack of an inherently meaningful measure of pairwise similarity between sequences. Model initialization is an unsolved problem in model-based clustering algorithms for categorical sequences. In this paper, we propose a simple and effective Markov model to approximate the conditional probability distribution (CPD) model, and use it to design a novel two-tier Markov model to represent a sequence cluster. Furthermore, we design a novel divisive hierarchical algorithm for clustering categorical sequences based on the two-tier Markov model. The experimental results on the data sets from three different domains demonstrate the promising performance of our models and clustering algorithm.
Keywords :
Markov processes; pattern clustering; sequences; statistical distributions; categorical sequence clustering; conditional probability distribution model; divisive hierarchical algorithm; model based clustering algorithm; pairwise similarity; sequence cluster; two-tier Markov model; Algorithm design and analysis; Clustering algorithms; Data models; Hidden Markov models; Markov processes; Numerical models; Vectors; Markov model; categorical sequence; clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1550-4786
Print_ISBN :
978-1-4577-2075-8
Type :
conf
DOI :
10.1109/ICDM.2011.13
Filename :
6137290