Title :
PCA based sequential feature space learning for gene selection
Author :
Yang, Jing-Lin ; Li, Han-Xiong
Author_Institution :
Dept. of Manuf. Eng. & Eng. Manage., City Univ. of Hong Kong, Hong Kong, China
Abstract :
Gene expression can be used for tumor subtype classification, clinical diagnosis, and prognosis prediction, but the underlying mechanism remains unknown. Data-based machine learning methods can be applied to the phenotype classification problem, but high dimensionality and small sample sizes cause many of them to fail. In this research, a PCA-based sequential feature space learning method is proposed for gene selection. It uses a two-level feature selection process. In the first level, PCA decomposition is performed to obtain the orthogonal axis, and the features are then projected onto and evaluated on the orthogonal axis. In the second level, the features that have large projections are selected to form the feature space, and the projections of all features onto this feature space are evaluated. Only features that have large projections on both the orthogonal axis and the feature subspace are selected as the feature subset. A neural network (NN) is then employed to learn the classification model. The PCA-based feature space learning proceeds sequentially until the classification performance is under a pre-specified threshold and stable. The proposed method has been applied to two gene microarray databases and shows good results.
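The two-level selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how projections are scored or thresholded, so the function name `two_level_select`, its parameters (`n_axes`, `n_level1`, `subspace_rank`, `n_keep`), the unit-length scaling of each gene, the low-rank basis for the feature space, and the "minimum of both scores" rule are all assumptions made for illustration. The neural network and the sequential stopping rule are not covered.

```python
import numpy as np


def unit_columns(X):
    """Center each feature (column) and scale it to unit length."""
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=0)
    norms[norms == 0] = 1.0  # constant features stay all-zero
    return Xc / norms


def two_level_select(X, n_axes=2, n_level1=20, subspace_rank=2, n_keep=10):
    """Hypothetical two-level selection; X is (n_samples, n_features)."""
    F = unit_columns(X)

    # Level 1: PCA axes via SVD, then score each feature by the length
    # of its projection onto the leading axes (assumed scoring rule).
    U, _, _ = np.linalg.svd(F, full_matrices=False)
    score1 = np.linalg.norm(U[:, :n_axes].T @ F, axis=0)
    level1 = np.argsort(score1)[::-1][:n_level1]

    # Level 2: the level-1 features span a feature space; use a
    # low-rank orthonormal basis of it (assumption) and project
    # every feature onto that basis.
    Q, _, _ = np.linalg.svd(F[:, level1], full_matrices=False)
    score2 = np.linalg.norm(Q[:, :subspace_rank].T @ F, axis=0)

    # Keep features that are large on both levels (assumed rule:
    # rank by the smaller of the two scores).
    combined = np.minimum(score1, score2)
    keep = np.sort(np.argsort(combined)[::-1][:n_keep])
    return keep, score1, score2


# Synthetic stand-in for a microarray matrix: 30 samples, 200 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 200))
keep, score1, score2 = two_level_select(X)
print(keep)
```

Because each gene vector is scaled to unit length, both scores lie between 0 and 1, which makes a single threshold or a top-k cut comparable across the two levels. In a sequential variant, the selected columns `X[:, keep]` would be passed to a classifier and the selection repeated until the stopping criterion is met.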
Keywords :
feature extraction; gene therapy; learning (artificial intelligence); medical image processing; neural nets; principal component analysis; NN; PCA based sequential feature space learning method; clinical diagnosis; data-based machine learning method; gene microarray databases; gene selection; neural network; orthogonal axis; phenotype classification problem; prognosis; tumor subtype classification; Artificial neural networks; Cancer; Gene expression; Machine learning; Principal component analysis; Training; Feature Selection; Gene Expressions; Microarray; PCA;
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
DOI :
10.1109/ICMLC.2010.5580720