Conference record number :
4602
Article title :
Selection of informative genes and classification of breast cancer microarray data using SOM and genetic algorithm
Authors :
Abolghasemi, Roghayeh (Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran); Vasighi, Mahdi (vasighi@iasbs.ac.ir; Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran); Hadi-Alijanvand, Hamid (Department of Biological Sciences, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran)
Number of pages :
1
Keywords :
Genetic algorithm, Microarray, High-dimensional datasets
Publication year :
1395
Conference title :
Second National Conference on Cancer Cell Biology
Document language :
English
Abstract :
Microarray technology is a powerful tool for analyzing thousands of genes simultaneously and can help discover treatment methods for a variety of diseases, especially cancer. The most challenging aspect of microarray data analysis is the large number of genes relative to the small number of samples. Feature selection and dimension reduction methods are therefore used to obtain a simpler and more accurate data model. In this study, a new method for finding informative genes and classifying microarray data is proposed. The proposed method has three main steps: i) the dimension of the data is reduced by an unsupervised self-organizing map (SOM); ii) the optimal subsets of neurons are extracted and the neurons are ranked using a procedure based on a genetic algorithm; iii) informative genes are selected based on some criterion. The proposed method is applied to breast cancer microarray data, one of the challenging datasets in this field, to classify samples and select the most important genes. This dataset includes two tumor types, luminal and non-luminal, and the proposed approach increased the average classification accuracy in 10-fold cross-validation up to 97 percent for this dataset. Furthermore, the genes selected as informative, such as FYN, CD8A, GNLY, PLEKHF1, IL1R2, TBC1D3, and KRTHB6, are effective factors in cancer owing to their related biological functions in “Cell Mobility” and “Immune System”. The improvement in accuracy and the biological evidence for the selected genes show the potential of the proposed method for finding the most important features in high-dimensional datasets.
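The abstract does not give implementation details, so the following Python sketch is only one possible reading of the three steps, not the authors' method. It assumes synthetic data, a 1-D SOM trained on gene expression profiles, a bit-mask genetic algorithm whose fitness is leave-one-out nearest-centroid accuracy, neurons ranked by fitness-weighted selection frequency, and genes selected by membership in the top-ranked neurons. All function names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, n_neurons=9, epochs=20, lr0=0.5, sigma0=2.0):
    """Train a 1-D SOM on the rows of `data`; return weights (n_neurons x dim)."""
    weights = data[rng.choice(len(data), n_neurons, replace=False)].astype(float)
    positions = np.arange(n_neurons)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1 - frac)                  # decaying learning rate
            sigma = sigma0 * (1 - frac) + 0.5      # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            h = np.exp(-((positions - bmu) ** 2) / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def assign_bmu(data, weights):
    """Index of the best-matching neuron for every row of `data`."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def fitness(mask, feats, labels):
    """Leave-one-out nearest-centroid accuracy on the masked neuron features."""
    if not mask.any():
        return 0.0
    f = feats[:, mask]
    correct = 0
    for i in range(len(f)):
        keep = np.arange(len(f)) != i
        cents = [f[keep & (labels == c)].mean(axis=0) for c in (0, 1)]
        pred = int(np.linalg.norm(f[i] - cents[1]) < np.linalg.norm(f[i] - cents[0]))
        correct += pred == labels[i]
    return correct / len(f)

def run_ga(feats, labels, pop_size=20, generations=15, p_mut=0.1):
    """Elitist GA over boolean masks that select subsets of neurons."""
    n = feats.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(m, feats, labels) for m in pop])
        parents = pop[np.argsort(-scores)[: pop_size // 2]]   # keep the best half
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                          # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut                    # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(m, feats, labels) for m in pop])
    return pop, scores

# Synthetic stand-in for a microarray: 30 samples x 60 genes, two classes,
# with the first 5 genes shifted in class 1 (an assumption, not real data).
labels = np.array([0] * 15 + [1] * 15)
X = rng.normal(size=(30, 60))
X[labels == 1, :5] += 2.0

# Step i: reduce dimension; each neuron is a prototype expression profile.
genes = X.T                       # genes x samples
weights = train_som(genes)        # neurons x samples
feats = weights.T                 # samples x neurons (reduced features)

# Step ii: GA over neuron subsets, then rank neurons.
pop, scores = run_ga(feats, labels)
neuron_score = (pop * scores[:, None]).sum(axis=0)

# Step iii: keep genes mapped to the two top-ranked neurons.
gene_bmu = assign_bmu(genes, weights)
top_neurons = np.argsort(-neuron_score)[:2]
selected = np.where(np.isin(gene_bmu, top_neurons))[0]
print("best subset accuracy:", scores.max())
print("selected gene indices:", selected)
```

In this sketch the final selection criterion (genes whose best-matching unit is one of the two top-ranked neurons) stands in for the unspecified "some criterion" of step iii. The abstract reports 10-fold cross-validation; leave-one-out is used here only to keep the example short.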
Country :
Iran