Title :
Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning
Author :
Chakraborty, Debasis ; Maulik, Ujjwal
Author_Institution :
Murshidabad Coll. of Eng. & Technol., Berhampore, India
Abstract :
Microarrays have gone from obscurity to being almost ubiquitous in biological research. At the same time, the statistical methodology for microarray analysis has progressed from simple visual assessment of results to novel algorithms for analyzing changes in expression profiles. In a micro-RNA (miRNA) or gene-expression profiling experiment, the expression levels of thousands of genes or miRNAs are monitored simultaneously to study the effects of certain treatments, diseases, and developmental stages on their expression. Microarray-based gene expression profiling can be used to identify genes whose expression changes in response to pathogens or other organisms by comparing gene expression in infected cells or tissues with that in uninfected ones. Recent studies have revealed that patterns of altered microarray expression profiles in cancer can serve as molecular biomarkers for tumor diagnosis, prognosis of disease-specific outcomes, and prediction of therapeutic responses. Microarray data sets containing expression profiles of a number of miRNAs or genes are used to identify biomarkers whose expression is dysregulated between normal and malignant tissues. However, small sample size remains a bottleneck in designing successful classification methods. On the other hand, an adequate number of microarray samples without clinical knowledge (unlabeled samples) can be employed as an additional source of information. In this paper, a combination of kernelized fuzzy rough set (KFRS) and semisupervised support vector machine (S3VM) is proposed for predicting cancer biomarkers from one miRNA and three gene expression data sets. Biomarkers are discovered using three feature selection methods, including KFRS. The effectiveness of the proposed KFRS and S3VM combination on the microarray data sets is demonstrated, and the cancer biomarkers identified from the miRNA data are reported. Furthermore, biological significance tests are conducted for the miRNA cancer biomarkers.
Keywords :
RNA; biomedical equipment; cancer; diseases; feature selection; fuzzy set theory; genetics; lab-on-a-chip; learning (artificial intelligence); medical diagnostic computing; molecular biophysics; patient diagnosis; patient monitoring; patient treatment; pattern classification; statistical analysis; support vector machines; tumours; KFRS; biological research; gene-miRNA expression; kernelized fuzzy rough set; malignant tissues; miRNA cancer biomarkers; microRNA profiles; microarray analysis; microarray data; microarray-based gene expression; molecular biomarkers; semisupervised learning; semisupervised support vector machine; statistical methodology; therapeutic responses; tumor diagnosis; tumor prognosis; Biomarkers; Gene expression; Kernel; Training; Tumors; Cancer biomarkers; semisupervised SVM; successive filtering
Journal_Title :
Translational Engineering in Health and Medicine, IEEE Journal of
DOI :
10.1109/JTEHM.2014.2375820