DocumentCode
1989458
Title
An experimental study of the effect of frequency of co-occurrence of features in clustering
Author
Pai, Radhika M. ; Ananthanarayana, V.S.
Author_Institution
Dept. of Comput. Sci. & Eng., MIT, Manipal
fYear
2007
fDate
12-15 Feb. 2007
Firstpage
1
Lastpage
4
Abstract
This paper explores the effect of the frequency of co-occurrence of features on the accuracy of clustering results by incorporating a frequency component into the clustering algorithm. Frequency here means the number of times a sequence of features appears in the data set. The algorithm used is the PC-tree (pattern count tree) based clustering algorithm. The PC-tree is a compact and complete representation of the data set. It is independent of data order and incremental, and it can be applied to changing data and changing knowledge, i.e., dynamic databases. Each node of the PC-tree has, in addition to other fields, a count field, which keeps track of the number of features shared by the pattern. In the literature, the PC-tree was used for clustering, and the count field was used only to retrieve the transactions. In this paper, we make use of this field in the clustering itself. We also use the partitioned PC-tree based algorithm and study the effect of frequency on its accuracy. We conducted extensive experiments on the OCR handwritten digit dataset, a real dataset, and observed the effect of frequency on the clustering results. The results of all our experiments are tabulated.
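The count field mentioned in the abstract can be illustrated with a minimal sketch. It assumes the PC-tree behaves like a prefix tree over feature sequences in which every node carries a count. The names `PCNode`, `PCTree`, `insert`, and `prefix_count` are hypothetical and do not come from the paper, and a dictionary of children stands in for whatever node layout the authors actually use.

```python
class PCNode:
    """One node of the illustrative tree: a feature plus a count."""

    def __init__(self, feature):
        self.feature = feature
        self.count = 0        # how many inserted patterns pass through this node
        self.children = {}    # feature -> PCNode (assumed layout, not the paper's)


class PCTree:
    """Prefix tree over feature sequences with per-node counts (hypothetical)."""

    def __init__(self):
        self.root = PCNode(None)

    def insert(self, pattern):
        # Walk the pattern feature by feature, creating nodes as needed
        # and incrementing the count of every node on the path.
        node = self.root
        for feature in pattern:
            child = node.children.get(feature)
            if child is None:
                child = PCNode(feature)
                node.children[feature] = child
            child.count += 1
            node = child

    def prefix_count(self, prefix):
        # Frequency with which this sequence of features starts a pattern.
        node = self.root
        for feature in prefix:
            node = node.children.get(feature)
            if node is None:
                return 0
        return node.count


tree = PCTree()
for pattern in [(1, 3, 4), (1, 3, 7), (2, 5)]:
    tree.insert(pattern)
```

In this sketch, patterns that share a leading run of features share nodes, so the count stored on a node gives the frequency of that feature sequence. That stored frequency is the kind of information the abstract proposes to use during clustering rather than only for retrieving transactions.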
Keywords
pattern clustering; tree data structures; clustering algorithm; co-occurrence frequency component; data structure; dynamic databases; pattern count-tree based clustering algorithm; Clustering algorithms; Computer science; Data analysis; Data structures; Frequency; Information technology; Optical character recognition software; Partitioning algorithms; Pattern recognition; Spatial databases
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
Conference_Location
Sharjah
Print_ISBN
978-1-4244-0778-1
Electronic_ISBN
978-1-4244-1779-8
Type
conf
DOI
10.1109/ISSPA.2007.4555535
Filename
4555535