Title :
Classification of micro-array gene expression data using neural networks
Author :
Tian, David ; Burley, Keith
Author_Institution :
Dept. of Comput., Sheffield Hallam Univ., Sheffield, UK
Abstract :
Classification of yeast genes based on their expression levels obtained from micro-array hybridization experiments is an important and challenging application domain in data mining and knowledge discovery. Over the past decade, neural networks and support vector machines (SVMs) have achieved good results for gene classification. This paper presents a methodology which uses two neural networks to classify unseen genes based on their expression levels. To remove some of the noise and deal with the imbalanced class distribution of the dataset, data pre-processing is performed before classification; it comprises data cleaning, data transformation and data over-sampling using the SMOTE algorithm. Thereafter, two neural networks with different architectures are trained using Scaled Conjugate Gradient in two different ways: 1) the training-validation-testing approach and 2) 10-fold cross-validation. Experimental results show that this methodology outperforms the previous best-performing SVM for this problem and 8 other classifiers: 3 SVMs, C4.5, Bayesian network, Naive Bayes, K-NN and JRip.
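The over-sampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. This is a simplified variant written for illustration, not the authors' implementation; the function name `smote`, its parameters, the random choice of base sample, and the toy data are all assumptions.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Simplified sketch: the base sample is drawn at random for each
    synthetic point, which is an assumption made here for brevity.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of all other minority samples, nearest to base first.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature expression profiles (made-up values, not from the paper).
minority = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.3, 0.25]]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region already occupied by the minority class. Over-sampling of this kind would normally be applied to the training portion only, so that validation and test folds contain real samples alone.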
Keywords :
biology computing; data mining; genetics; neural nets; pattern classification; support vector machines; SMOTE algorithm; data classification; data pre-processing; micro-array gene expression; neural networks; scaled conjugate gradient; Artificial neural networks
Conference_Title :
Neural Networks (IJCNN), The 2010 International Joint Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-6916-1
DOI :
10.1109/IJCNN.2010.5596568