DocumentCode :
1989458
Title :
An experimental study of the effect of frequency of co-occurrence of features in clustering
Author :
Pai, Radhika M. ; Ananthanarayana, V.S.
Author_Institution :
Dept. of Comput. Sci. & Eng., MIT, Manipal
fYear :
2007
fDate :
12-15 Feb. 2007
Firstpage :
1
Lastpage :
4
Abstract :
In this paper, we explore the effect of the frequency of co-occurrence of features on the accuracy of clustering results by incorporating a frequency component into the clustering algorithm. By frequency we mean the number of times a sequence of features appears in the data set. The algorithm we use is the PC (pattern count)-tree based clustering algorithm. The PC-tree is a compact and complete representation of the data set. It is independent of data order, incremental, and applicable to changing data and changing knowledge, i.e., dynamic databases. Each node of the PC-tree has, in addition to other fields, a count field, which keeps track of the number of features shared by the pattern. In the literature, the PC-tree has been used for clustering, but the count field was used only to retrieve the transactions. In this paper, we make use of this field in the clustering itself. We also use the partitioned PC-tree based algorithm and study the effect of frequency on its accuracy. We conducted extensive experiments on the OCR handwritten digit data set, a real data set, and observed the effect of frequency on the clustering results. The results of all our experiments are tabulated.
Keywords :
pattern clustering; tree data structures; clustering algorithm; co-occurrence frequency component; data structure; dynamic databases; pattern count-tree based clustering algorithm; Clustering algorithms; Computer science; Data analysis; Data structures; Frequency; Information technology; Optical character recognition software; Partitioning algorithms; Pattern recognition; Spatial databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
Conference_Location :
Sharjah
Print_ISBN :
978-1-4244-0778-1
Electronic_ISBN :
978-1-4244-1779-8
Type :
conf
DOI :
10.1109/ISSPA.2007.4555535
Filename :
4555535