Title :
Identification of Full and Partial Class Relevant Genes
Author :
Zhu, Zexuan ; Ong, Yew-Soon ; Zurada, Jacek M.
Author_Institution :
College of Computer Science & Software Engineering, Shenzhen University, Shenzhen, China
Abstract :
Multiclass cancer classification on microarray data has made it feasible to diagnose all of the common malignancies in parallel. Using multiclass cancer feature selection approaches, it is now possible to identify genes relevant to a set of cancer types. However, besides identifying the genes relevant to the set of all cancer types, it would be more informative to biologists if the relevance of each gene to a specific cancer type or subset of cancer types could be pinpointed. In this paper, we introduce two new definitions of multiclass relevancy features, i.e., full class relevant (FCR) and partial class relevant (PCR) features. In particular, FCR features are genes that serve as candidate biomarkers for discriminating all cancer types. PCR features, on the other hand, are genes that distinguish subsets of cancer types. Subsequently, a Markov blanket embedded memetic algorithm is proposed for the simultaneous identification of both FCR and PCR genes. Results obtained on commonly used synthetic and real-world microarray data sets show that the proposed approach converges to valid FCR and PCR genes that would assist biologists in their research work. The identification of both FCR and PCR genes is found to improve classification accuracy on many microarray data sets. A further comparison study with existing state-of-the-art feature selection algorithms also reveals the effectiveness and efficiency of the proposed approach.
Keywords :
bioinformatics; cancer; genetic algorithms; genetics; lab-on-a-chip; molecular biophysics; patient diagnosis; Markov blanket; cancer diagnosis; classification accuracy; feature selection; full class relevant features; gene selection; malignancies; memetic algorithm; microarray data; multiclass cancer classification; multiclass relevancy features; partial class relevant features; feature/gene selection; microarray; Algorithms; Computational Biology; Computer Simulation; Databases, Genetic; Genes; Humans; Markov Chains; Neoplasms; Oligonucleotide Array Sequence Analysis; Tumor Markers, Biological
Journal_Title :
IEEE/ACM Transactions on Computational Biology and Bioinformatics
DOI :
10.1109/TCBB.2008.105