DocumentCode
3125664
Title
A New Markov Model for Clustering Categorical Sequences
Author
Xiong, Tengke ; Wang, Shengrui ; Jiang, Qingshan ; Huang, Joshua Zhexue
Author_Institution
Dept. of Comput. Sci., Univ. of Sherbrooke, Sherbrooke, QC, Canada
fYear
2011
fDate
11-14 Dec. 2011
Firstpage
854
Lastpage
863
Abstract
Clustering categorical sequences remains an open and challenging task due to the lack of an inherently meaningful measure of pairwise similarity between sequences. Model initialization is an unsolved problem in model-based clustering algorithms for categorical sequences. In this paper, we propose a simple and effective Markov model to approximate the conditional probability distribution (CPD) model, and use it to design a novel two-tier Markov model to represent a sequence cluster. Furthermore, we design a novel divisive hierarchical algorithm for clustering categorical sequences based on the two-tier Markov model. Experimental results on data sets from three different domains demonstrate the promising performance of our models and clustering algorithm.
Keywords
Markov processes; pattern clustering; sequences; statistical distributions; categorical sequence clustering; conditional probability distribution model; divisive hierarchical algorithm; model based clustering algorithm; pairwise similarity; sequence cluster; two-tier Markov model; Algorithm design and analysis; Clustering algorithms; Data models; Hidden Markov models; Markov processes; Numerical models; Vectors; Markov model; categorical sequence; clustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2011 IEEE 11th International Conference on
Conference_Location
Vancouver, BC
ISSN
1550-4786
Print_ISBN
978-1-4577-2075-8
Type
conf
DOI
10.1109/ICDM.2011.13
Filename
6137290