Title :
Extracting very simple diagnostic rules from microarray data
Author :
Wang, Lipo ; Chu, Feng
Author_Institution :
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore, Singapore
Date :
Aug. 31 2010-Sept. 4 2010
Abstract :
We present an approach to deriving very simple classification rules from microarray data by first selecting very small gene subsets that can ensure highly accurate classification of cancers. Finding such minimum gene subsets can greatly reduce the computational load and the “noise” arising from irrelevant genes. The derived classification rules allow for accurate diagnosis without the need for any classifiers. This work can simplify gene expression tests by including only a very small number of genes rather than thousands or tens of thousands, which can significantly reduce the cost of cancer testing. These studies also call for further investigation into possible biological relationships between these few genes and cancer development and treatment. For example, we report the following simple, yet 100% accurate, diagnostic rules involving only 2 genes to separate the 3 types of lymphoma patients: the patient has diffuse large B-cell lymphoma (DLBCL) if and only if the expression level of gene GENE1622X is greater than -0.75; the patient has chronic lymphocytic leukaemia (CLL) if and only if the expression level of gene GENE540X is less than -1; and the patient has follicular lymphoma (FL) otherwise, i.e., if and only if the expression level of gene GENE1622X is less than -0.75 and the expression level of gene GENE540X is greater than -1.
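The two-gene rules quoted in the abstract can be written as a short function. The following is only an illustrative sketch, not the authors' code: the function name `classify_lymphoma` and its parameter names are hypothetical, and the inputs are assumed to be expression levels on the same normalized scale as the thresholds in the abstract. The abstract does not say what happens when a value equals a threshold or when both the DLBCL and CLL conditions hold, so the sketch assumes that ties fall into the "otherwise" (FL) branch and that a sample satisfying both conditions is reported as an error.

```python
def classify_lymphoma(gene1622x: float, gene540x: float) -> str:
    """Apply the two-gene rules from the abstract (hypothetical helper).

    Inputs are assumed to be expression levels on the same scale as the
    thresholds quoted in the abstract (-0.75 and -1).
    """
    is_dlbcl = gene1622x > -0.75  # rule for diffuse large B-cell lymphoma
    is_cll = gene540x < -1.0      # rule for chronic lymphocytic leukaemia

    # Assumption: the abstract does not cover a sample that meets both
    # conditions, so it is flagged here instead of being labelled.
    if is_dlbcl and is_cll:
        raise ValueError("both the DLBCL and CLL rules fired")
    if is_dlbcl:
        return "DLBCL"
    if is_cll:
        return "CLL"
    # Otherwise: follicular lymphoma. Assumption: values exactly equal to
    # a threshold also land here, since the abstract leaves ties open.
    return "FL"
```

For instance, `classify_lymphoma(0.2, 0.5)` returns `"DLBCL"`, while `classify_lymphoma(-1.2, 0.0)` falls through to `"FL"`.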
Keywords :
cancer; cellular biophysics; genetics; medical computing; molecular biophysics; patient diagnosis; patient treatment; cancer testing; cancers; chronic lymphocytic leukaemia; diagnostic rules; diffuse large B-cell lymphoma; follicular lymphoma; gene GENE1622X; gene expression tests; gene subsets; lymphoma patients; microarray data; noise; Accuracy; Bioinformatics; Cancer; Fuzzy neural networks; Gene expression; Testing; Training data; Algorithms; Decision Support Systems, Clinical; Diagnosis, Computer-Assisted; Humans; Neoplasm Proteins; Neoplasms; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity; Tumor Markers, Biological
Conference_Title :
Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE
Conference_Location :
Buenos Aires
Print_ISBN :
978-1-4244-4123-5
DOI :
10.1109/IEMBS.2010.5626565