DocumentCode
524640
Title
A Clustering System for Gene Expression Data Based upon Genetic Programming and the HS-Model
Author
Liu, Guiquan ; Jiang, Xiufang ; Wen, Lingyun
Author_Institution
Key Lab. of Software in Comput. & Commun., Univ. of Sci. & Technol. of China, Hefei, China
Volume
1
fYear
2010
fDate
28-31 May 2010
Firstpage
238
Lastpage
241
Abstract
Cluster analysis is a major method for studying gene function and gene regulation, because prior knowledge about gene data is lacking. Many existing clustering methods require manual operations or predetermined parameters, which are difficult to provide for gene data. In addition, gene data have their own characteristics, such as large scale, high dimensionality, and noise. A systematic clustering algorithm is therefore needed to deal with gene data effectively. In this paper, a novel genetic programming (GP) clustering system for gene data, based on the hierarchical statistical model (HS-model), is proposed, together with an appropriate fitness function. The system can largely eliminate the influence of data scale and dimension. The proposed GP clustering system is applied to cluster the entire, intact yeast gene data set without dimensionality reduction. The experimental results indicate that the algorithm is highly efficient and can effectively deal with missing values in gene data sets.
Keywords
Clustering algorithms; Clustering methods; Communication system software; Computer science; Electronic mail; Gene expression; Genetic programming; Information analysis; Manuals; Optimization methods; cluster analysis; fitness function; genetic programming; missing value;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Science and Optimization (CSO), 2010 Third International Joint Conference on
Conference_Location
Huangshan, Anhui, China
Print_ISBN
978-1-4244-6812-6
Electronic_ISBN
978-1-4244-6813-3
Type
conf
DOI
10.1109/CSO.2010.116
Filename
5532998