Title :
Smoothing blemished gene expression microarray data via missing value imputation
Author :
Cai, Zhipeng ; Shi, Yi ; Song, Meng ; Goebel, Randy ; Lin, Guohui
Author_Institution :
Department of Computing Science, University of Alberta, Edmonton, T6G 2E8, Canada
Abstract :
Gene expression microarray technology has enabled advanced biological and medical research, but the data are well recognized to be noisy and must be used with caution, since they are greatly affected by many experimental factors such as RNA concentration, spot typing, hybridization conditions, and image analysis. It is highly desirable to identify the inaccurate data entries (“stains”) and subsequently curate them. In this paper, we propose a novel computational method, based on feature gene selection and sample classification, to efficiently discover the stains and apply imputation methods to estimate their values. Extensive experimental results on three Affymetrix platforms for human cancer diagnosis showed that, by picking only 1–4% of the data entries as the most likely stains, the smoothed datasets could be used for better downstream data analyses such as robust biomarker identification and disease diagnosis.
Keywords :
Biomedical imaging; Cancer; Data analysis; Gene expression; Humans; Image analysis; Medical diagnostic imaging; RNA; Robustness; Smoothing methods; Algorithms; Artificial Intelligence; Diagnosis, Computer-Assisted; Gene Expression Profiling; Neoplasm Proteins; Neoplasms; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Reproducibility of Results; Sample Size; Sensitivity and Specificity; Signal Processing, Computer-Assisted; Tumor Markers, Biological
Conference_Title :
Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual International Conference of the IEEE
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4244-1814-5
Electronic_ISBN :
1557-170X
DOI :
10.1109/IEMBS.2008.4650505