Title :
Feature set enhancement via hierarchical clustering for microarray classification
Author :
Bosio, Mattia ; Pujalte, Pau Bellot ; Salembier, Philippe ; Oliveras-Vergés, Albert
Author_Institution :
Dept. of Signal Theory & Communications, Technical University of Catalonia, Barcelona, Spain
Abstract :
This paper proposes a new method for gene expression classification. In a first step, the original feature set is enriched with new features, called metagenes, produced via hierarchical clustering. In a second step, a reliable classifier is built through a wrapper feature selection process. The selection relies on two criteria: the classical classification error rate and a new reliability measure. The result is a classifier with good predictive ability that uses as few features as possible, which reduces the risk of overfitting. The method has been tested on three public cancer datasets: leukemia, lymphoma and colon. It obtained interesting classification results, and the experiments confirmed the utility of both the metagenes and the feature ranking criterion in improving the final classifier.
Keywords :
cancer; medical computing; pattern classification; pattern clustering; classical classification error rate; classifier; colon; feature set enhancement; gene expression classification; hierarchical clustering; leukemia; lymphoma; metagenes; microarray classification; overfitting risk reduction; public cancer datasets; reliability measure; wrapper feature selection process; Clustering algorithms; Error analysis; Gene expression; Principal component analysis; Reliability; Treelet; cancer microarray classification; feature selection
Conference_Title :
Genomic Signal Processing and Statistics (GENSIPS), 2011 IEEE International Workshop on
Conference_Location :
San Antonio, TX
Print_ISBN :
978-1-4673-0491-7
Electronic_ISBN :
2150-3001
DOI :
10.1109/GENSiPS.2011.6169486