DocumentCode :
2772620
Title :
A Contrast Pattern Based Clustering Quality Index for Categorical Data
Author :
Liu, Qingbao ; Dong, Guozhu
Author_Institution :
C4ISR Technol. Key Lab., Nat. Univ. of Defense Technol., Changsha, China
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
860
Lastpage :
865
Abstract :
Since clustering is unsupervised and highly exploratory, clustering validation (i.e., assessing the quality of clustering solutions) has been an important and long-standing research problem. Existing validity measures have significant shortcomings. This paper proposes a novel contrast pattern based clustering quality index (CPCQ) for categorical data, which utilizes the quality and diversity of the contrast patterns (CPs) that contrast the clusters in a clustering. High-quality CPs can characterize clusters and discriminate them from each other. Experiments show that the CPCQ index (1) can recognize that expert-determined classes are the best clusters for many datasets from the UCI repository; (2) does not give inappropriate preference to larger numbers of clusters; and (3) does not require a user to provide a distance function.
Keywords :
data handling; pattern clustering; CPCQ index; categorical data; contrast pattern based clustering quality index; Computer science; Data analysis; Data engineering; Data mining; Databases; Frequency; Hamming distance; Noise measurement; Pattern recognition; USA Councils; Clustering validation; clustering quality index; contrast pattern
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.105
Filename :
5360324