DocumentCode :
1938453
Title :
Study of Clustering Algorithm Based on Model Data
Author :
Li, Kai ; Cui, Li-juan
Author_Institution :
HeBei Univ., Baoding
Volume :
7
fYear :
2007
fDate :
19-22 Aug. 2007
Firstpage :
3961
Lastpage :
3964
Abstract :
Clustering is a key tool in data mining and pattern recognition. Traditional clustering algorithms usually represent objects as vectors whose components describe features. In real tasks, however, the objects to be clustered may be models rather than data points, for example neural networks, decision trees, or support vector machines. This paper studies clustering algorithms based on model data. By defining an extended measure, clustering methods are studied for abstract data objects, and a framework for clustering models is presented. To validate the effectiveness of model clustering, a hierarchical model clustering algorithm is used in the experiments. The models are BP (back propagation) neural networks trained with the BP algorithm. Both the same-fault measure and the double-fault measure are used as pairwise measures between models, and distances between clusters are computed with single link and complete link, respectively. In this way, a subset of the neural network models is drawn from each cluster, which improves the diversity of the selected models, and this subset is then combined into an ensemble. The relations among the number of clusters, the ensemble size, and ensemble performance are also studied experimentally. Experimental results show that selecting a subset of models through model clustering improves ensemble performance.
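The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: each model is stood in for by a 0/1 error vector on a shared validation set (1 = misclassified) rather than a trained BP network, the dissimilarity `1 - double_fault` is an assumption about how the double-fault measure is turned into a distance, and picking the lowest-error model per cluster is an assumed selection rule. All function names are hypothetical.

```python
from itertools import combinations


def double_fault(errors_a, errors_b):
    """Fraction of samples that both models misclassify."""
    both = sum(1 for a, b in zip(errors_a, errors_b) if a and b)
    return both / len(errors_a)


def distance_matrix(errors):
    """Pairwise dissimilarity, assumed here to be 1 - double_fault."""
    n = len(errors)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = 1.0 - double_fault(errors[i], errors[j])
    return dist


def agglomerate(dist, n_clusters, linkage="single"):
    """Agglomerative clustering with single or complete link."""
    link = min if linkage == "single" else max
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = link(dist[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters


def pick_representatives(clusters, errors):
    """Assumed rule: keep the model with the fewest errors in each cluster."""
    return [min(c, key=lambda m: sum(errors[m])) for c in clusters]


# Toy error vectors for four models on six validation samples.
errors = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
]
clusters = agglomerate(distance_matrix(errors), n_clusters=2, linkage="single")
selected = pick_representatives(clusters, errors)
print(clusters, selected)
```

On this toy input, models 0 and 1 fail on the same samples, as do models 2 and 3, so both linkages group them into two clusters and one model from each is kept for the ensemble.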
Keywords :
backpropagation; neural nets; pattern clustering; abstract data objects; backpropagation neural networks; data mining; ensemble learning; hierarchical model clustering algorithm; learning method; pattern recognition; Clustering algorithms; Clustering methods; Cybernetics; Decision trees; Euclidean distance; Iterative algorithms; Machine learning; Neural networks; Partitioning algorithms; Support vector machines; Diversity; Measure space; Model clustering; Validation of clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
Type :
conf
DOI :
10.1109/ICMLC.2007.4370838
Filename :
4370838