DocumentCode :
2248831
Title :
PCA based sequential feature space learning for gene selection
Author :
Yang, Jing-Lin ; Li, Han-Xiong
Author_Institution :
Dept. of Manuf. Eng. & Eng. Manage., City Univ. of Hong Kong, Hong Kong, China
Volume :
6
fYear :
2010
fDate :
11-14 July 2010
Firstpage :
3079
Lastpage :
3084
Abstract :
Gene expression can be used for tumor subtype classification, clinical diagnosis, and prognosis prediction, but the underlying mechanism remains unknown. Data-based machine learning methods can be applied to the phenotype classification problem, but high dimensionality and small sample size cause many of them to fail. In this research, a PCA-based sequential feature space learning method is proposed for gene selection. A two-level feature selection process is conducted. In the first level, PCA decomposition is performed to obtain the orthogonal axes, and the features are projected onto and evaluated on these axes. In the second level, the features that have large projections are selected to form the feature space, and the projections of all features onto this feature space are evaluated. Only features that have large projections on both the orthogonal axes and the feature subspace are selected as the feature subset. A neural network (NN) is then employed to learn the classification model. The PCA-based feature space learning proceeds sequentially until the classification performance is under a pre-specified threshold and stable. The proposed method has been applied to two gene microarray databases and shows good results.
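Illustrative sketch (not from the paper): the abstract does not give formulas, so the following is only one possible reading of the two-level selection step, written with NumPy. The function name two_level_select, the parameters n_axes, n_seed, n_level1 and min_fraction, and both scoring rules are assumptions introduced here for illustration. The neural network classifier and the sequential outer loop are omitted.

```python
import numpy as np

def two_level_select(X, n_axes=1, n_seed=2, n_level1=10, min_fraction=0.9):
    """Hypothetical two-level gene selection; X is (samples x genes)."""
    Xc = X - X.mean(axis=0)  # center each gene (column)

    # Level 1 (assumed): PCA via SVD. Rows of Vt are orthogonal axes in gene
    # space; a gene's projection on an axis is its loading on that axis.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    score1 = np.linalg.norm(Vt[:n_axes], axis=0)
    order = np.argsort(score1)[::-1]       # genes by decreasing projection
    seed = order[:n_seed]                  # genes that span the feature space
    level1 = order[:n_level1]              # genes that pass level 1

    # Level 2 (assumed): orthonormal basis of the span of the seed gene
    # profiles, then the fraction of each gene's norm captured by that span.
    Q, _ = np.linalg.qr(Xc[:, seed])
    norms = np.linalg.norm(Xc, axis=0)
    safe_norms = np.where(norms > 0, norms, 1.0)   # avoid division by zero
    score2 = np.linalg.norm(Q.T @ Xc, axis=0) / safe_norms

    # Keep only genes with large projections at both levels.
    selected = level1[score2[level1] >= min_fraction]
    return np.sort(selected)
```

In this sketch n_seed should be smaller than the number of samples, since otherwise the seed genes span the whole sample space and the second-level score no longer discriminates between genes.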
Keywords :
feature extraction; gene therapy; learning (artificial intelligence); medical image processing; neural nets; principal component analysis; NN; PCA based sequential feature space learning method; clinical diagnosis; data-based machine learning method; gene microarray databases; gene selection; neural network; orthogonal axis; phenotype classification problem; prognosis; tumor subtype classification; Artificial neural networks; Cancer; Gene expression; Machine learning; Principal component analysis; Training; Feature Selection; Gene Expressions; Microarray; PCA;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
Type :
conf
DOI :
10.1109/ICMLC.2010.5580720
Filename :
5580720