Title :
Cancer classification using clustering based gene selection and artificial neural networks
Author :
Rahideh, Akbar ; Shaheed, M. Hasan
Author_Institution :
Sch. of Electr. & Electron. Eng., Shiraz Univ. of Technol., Shiraz, Iran
Abstract :
In this investigation, a cancer classification approach is presented using clustering-based gene selection and artificial neural networks. To address the so-called 'curse of dimensionality', a T-statistic feature selection method, one of the univariate filter techniques, is used to select the most informative genes. However, instead of selecting a small group of relevant genes at once from the whole range of data, the genes are clustered into a number of groups, and the intended gene subset is then formed by incorporating the top-ranked members from each group. This process is adopted not only to ensure the selection of the most relevant and informative genes but also to bring information diversity to the selected genes. Three different clustering algorithms are used, namely K-means clustering, fuzzy C-means clustering and the self-organizing map (SOM). Sample classification is then carried out using a multi-layered perceptron (MLP) neural network trained with the Levenberg-Marquardt algorithm. The performance of the approach is evaluated in terms of accuracy, sensitivity and specificity and is found to be comparable with that of the non-clustering-based approach.
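The gene selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the T-statistic or say which features the genes are clustered on, so the sketch assumes a Welch two-sample t-statistic, clusters genes by their expression profile across samples, and uses a plain K-means in place of the three clustering algorithms. All function and parameter names (`t_statistic`, `kmeans`, `select_genes`, `per_cluster`) are hypothetical, and the MLP classification stage is not covered.

```python
import math
import random


def t_statistic(a, b):
    """Welch two-sample t-statistic for one gene (assumed form, not from the paper)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    denom = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / denom if denom > 0 else 0.0


def kmeans(points, k, iters=50, seed=0):
    """Plain K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Move each center to the mean of its members; empty clusters keep their center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_genes(expr, classes, k, per_cluster, seed=0):
    """Pick the top-ranked genes from each cluster.

    expr: one row per gene, one expression value per sample.
    classes: binary class label (0 or 1) per sample.
    """
    scores = []
    for row in expr:
        a = [v for v, c in zip(row, classes) if c == 0]
        b = [v for v, c in zip(row, classes) if c == 1]
        scores.append(abs(t_statistic(a, b)))
    labels = kmeans(expr, k, seed=seed)
    selected = []
    for c in range(k):
        members = [g for g, lab in enumerate(labels) if lab == c]
        members.sort(key=lambda g: scores[g], reverse=True)
        selected.extend(members[:per_cluster])
    return sorted(selected)
```

Calling `select_genes(expr, classes, k=2, per_cluster=1)` on a small expression matrix returns one gene index per cluster, so the subset mixes the strongest gene from each group rather than the overall top scorers, which may all come from one highly correlated group.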
Keywords :
cancer; genetics; learning (artificial intelligence); medical computing; multilayer perceptrons; pattern classification; pattern clustering; self-organising feature maps; set theory; statistical analysis; Levenberg-Marquardt algorithm; MLP artificial neural network training; SOM; T-statistic feature selection method; cancer classification approach; clustering-based gene selection; fuzzy c-means clustering; gene subset; k-means clustering; multilayered perceptron artificial neural network training; self-organizing map; univariate filter techniques; Accuracy; Cancer; Gene expression; Neural networks; Neurons; Training; Vectors; Cancer; classification; clustering; feature selection; gene expression; microarray data; neural network;
Conference_Title :
Control, Instrumentation and Automation (ICCIA), 2011 2nd International Conference on
Conference_Location :
Shiraz
Print_ISBN :
978-1-4673-1689-7
DOI :
10.1109/ICCIAutom.2011.6356828