Title :
Enhanced leukemia cancer classifier algorithm
Author :
Abd El-Nasser, Ahmed ; Shaheen, Mahboob ; El-Deeb, Hesham
Author_Institution :
Comput. Sci. Dept., Modern Acad. in Maadi, Cairo, Egypt
Abstract :
The development of data mining applications such as classification and clustering has shown the need for machine learning algorithms that can be applied to large-scale data. Although cancer classification has improved over the past 20 years, there has been no general approach for identifying new cancer classes (class discovery) or for assigning tumors to known classes (class prediction). Most proposed cancer classification methods come from statistics and machine learning, ranging from the older nearest neighbor analysis to the newer support vector machines, and no single classifier is superior to the rest. A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemia as a test case. A class discovery procedure automatically discovered the distinction between acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) without previous knowledge of these classes. This research has two main objectives: the first is to introduce the design and implementation of the SMIG (Select Most Informative Genes) algorithm, and the second is to design and implement the Enhanced Classification Algorithm (ECA) system, which improves leukemia cancer classification using the SMIG module and a ranking procedure. Experiments showed that, after preprocessing and classification, the proposed ECA system reaches an accuracy of 98% in 0.1 s, which is better than the techniques reported in previously published studies.
Keywords :
cancer; data mining; learning (artificial intelligence); medical computing; statistical analysis; support vector machines; tumours; ALL; AML; DNA microarrays; ECA system; SMIG; SMIG module; acute lymphoblastic leukemia; acute myeloid leukemia; cancer classification methods; data mining applications; enhanced leukemia cancer classifier algorithm; gene expression monitoring; human acute leukemia; implement enhanced classification algorithm; machine learning algorithms; nearest neighbor analysis; select most informative genes algorithm; statistical learning; support vector machines; tumors; Accuracy; Cancer; Classification algorithms; Decision trees; Gene expression; Machine learning algorithms; Support vector machines; Bioinformatics; Classification; DNA; Data Mining; Leukemia;
Conference_Title :
Science and Information Conference (SAI), 2014
Conference_Location :
London
Print_ISBN :
978-0-9893-1933-1
DOI :
10.1109/SAI.2014.6918222