• DocumentCode
    438956
  • Title
    An integrated system for class prediction using gene expression profiling
  • Author
    Chen, Dehang ; Hua, D.D. ; Liu, Zhenqiu ; Cheng, Zhi-Fu
  • Author_Institution
    Preventive Medicine & Biometrics, Uniformed Services Univ. of the Health Sci., Bethesda, MD, USA
  • Volume
    2
  • fYear
    2004
  • fDate
    6-9 Dec. 2004
  • Firstpage
    1023
  • Abstract
    Gene expression profiles have been successfully applied to class prediction. Because gene expression data contain a large number of genes (features) and a small number of samples, feature selection is essential when performing the prediction task. Many methods have been proposed to select features in microarray data analysis, but no single method performs uniformly well for all learning algorithms. It is therefore practical to find a feature selection method and a learning algorithm that together give superior performance. In this paper, we present an integrated scheme to perform class prediction based on gene expression profiles. The scheme incorporates a simple, novel feature selection procedure into naive Bayes models. Each selected gene has a high discriminatory-power score determined by the Brown-Forsythe test statistic. Any pair of selected genes has a low correlation. This supports the conditional independence among genes assumed by the naive Bayes models. To demonstrate its effectiveness, the proposed scheme was applied to three commonly used expression data sets: COLON, OVARIAN, and LEUKEMIA. The results show that the numbers of misclassified samples are 0, and 4, respectively.
  • Keywords
    Bayes methods; data analysis; feature extraction; genetics; learning (artificial intelligence); medical computing; medical image processing; statistical analysis; Brown-Forsythe test statistics; COLON expression data set; LEUKEMIA expression data set; OVARIAN expression data set; class prediction; discriminatory power; feature selection method; gene expression profiling; integrated system; learning algorithms; microarray data analysis; naive Bayes models; Biometrics; Bridges; Computer science; Data analysis; Filters; Gene expression; Hospitals; Statistical analysis; Telemedicine; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control, Automation, Robotics and Vision Conference, 2004. ICARCV 2004 8th
  • Print_ISBN
    0-7803-8653-1
  • Type
    conf
  • DOI
    10.1109/ICARCV.2004.1468984
  • Filename
    1468984
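
The scheme described in the abstract (score genes, keep high-scoring genes that are weakly correlated with each other, then fit a naive Bayes model) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the Brown-Forsythe statistic, so the sketch assumes the Brown-Forsythe F* statistic for equality of class means; the greedy selection rule, the threshold `max_abs_corr`, the gene count `n_genes`, the Gaussian class-conditional densities, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def brown_forsythe_scores(X, y):
    """Per-gene Brown-Forsythe F* statistic (assumed form: equality of class means)."""
    n_total = X.shape[0]
    grand_mean = X.mean(axis=0)
    numerator = np.zeros(X.shape[1])
    denominator = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        numerator += n_c * (Xc.mean(axis=0) - grand_mean) ** 2
        denominator += (1.0 - n_c / n_total) * Xc.var(axis=0, ddof=1)
    return numerator / (denominator + 1e-12)  # small constant avoids division by zero

def select_genes(X, y, n_genes=5, max_abs_corr=0.5):
    """Greedy pass (hypothetical rule): take genes in descending score order,
    keeping a gene only if its absolute Pearson correlation with every
    already-selected gene is below max_abs_corr."""
    order = np.argsort(brown_forsythe_scores(X, y))[::-1]
    selected = []
    for g in order:
        if all(abs(np.corrcoef(X[:, g], X[:, s])[0, 1]) < max_abs_corr
               for s in selected):
            selected.append(int(g))
        if len(selected) == n_genes:
            break
    return selected

class GaussianNaiveBayes:
    """Naive Bayes with an independent Gaussian per gene and class (an assumption)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_priors_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # log p(c) + sum over genes of log N(x_g | mean_cg, var_cg)
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * (np.log(2 * np.pi * self.vars_)[None]
                          + diff ** 2 / self.vars_[None]).sum(axis=2)
        return self.classes_[np.argmax(log_lik + self.log_priors_[None, :], axis=1)]

# Synthetic stand-in for an expression matrix: 40 samples x 200 genes,
# with the first 5 genes shifted in class 1. Not real microarray data.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 200))
X[y == 1, :5] += 3.0

genes = select_genes(X, y, n_genes=5, max_abs_corr=0.5)
model = GaussianNaiveBayes().fit(X[:, genes], y)
accuracy = float((model.predict(X[:, genes]) == y).mean())
print("selected genes:", genes, "training accuracy:", accuracy)
```

One design point worth noting in this sketch: correlation is computed over all samples, so genes that separate the classes in the same direction are themselves correlated and tend to exclude one another. Whether the paper measures correlation this way or within classes is not stated in the abstract. The accuracy printed here is a training-set figure on synthetic data; a real evaluation would use cross-validation with gene selection repeated inside each fold.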