DocumentCode :
2794320
Title :
Improved iterative pruning principal component analysis with graph-theoretic hierarchical clustering
Author :
Amornbunchornvej, C. ; Limpiti, T. ; Assawamakin, A. ; Intarapanich, A. ; Tongsima, S.
Author_Institution :
Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
fYear :
2012
fDate :
16-18 May 2012
Firstpage :
1
Lastpage :
4
Abstract :
Various unsupervised clustering algorithms have been used to infer population structure in genetic data. The goals are to separate individuals with similar genetic characteristics into clusters and to estimate the number of clusters within each dataset. Among these algorithms, a framework called iterative pruning principal component analysis (ipPCA) has been developed. It performs PCA iteratively on subsets of data samples and clusters them using fuzzy c-means. We believe that the choice of model-based clustering method affects the individual assignments and cluster quality, as well as the estimated number of clusters. Thus, in this paper we introduce into the ipPCA framework a hierarchical tree clustering concept from graph theory, whose performance is independent of cluster shapes. We also add a PCA-based feature selection technique as a data pre-processing step to reduce data dimensionality and increase computational efficiency. The resulting algorithm is called HiClust-ipPCA. We illustrate the improved clustering results of the HiClust-ipPCA algorithm using 47-breed bovine and 28-breed sheep datasets.
Keywords :
iterative methods; pattern clustering; principal component analysis; trees (mathematics); HiClust-ipPCA algorithm; PCA-based feature selection technique; bovine datasets; cluster quality; computational efficiency; data dimension reduction; fuzzy c-mean clustering; genetic characteristics; genetic data; graph theory; graph-theoretic hierarchical clustering; hierarchical tree clustering concept; ipPCA framework; iterative pruning principal component analysis; model-based clustering method; population structure; sheep datasets; unsupervised clustering algorithms; Bovine; Clustering algorithms; Clustering methods; Genetics; Principal component analysis; Shape; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2012 9th International Conference on
Conference_Location :
Phetchaburi
Print_ISBN :
978-1-4673-2026-9
Type :
conf
DOI :
10.1109/ECTICon.2012.6254120
Filename :
6254120