• DocumentCode
    3408924
  • Title
    Selection of patient samples and genes for outcome prediction
  • Author
    Liu, Huiqing ; Li, Jinyan ; Wong, Limsoon
  • Author_Institution
    Inst. for Infocomm Res., Singapore, Singapore
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    382
  • Lastpage
    392
  • Abstract
    Gene expression profiles with clinical outcome data enable monitoring of disease progression and prediction of patient survival at the molecular level. We present a new computational method for outcome prediction. Our idea is to use an informative subset of the original training samples. This subset consists only of short-term survivors who died within a short period and long-term survivors who were still alive after a long follow-up time. These extreme training samples yield a clear platform for identifying genes whose expression is related to survival. To find relevant genes, we combine two feature selection methods (an entropy measure and the Wilcoxon rank sum test) so that a set of sharply discriminating features is identified. The selected training samples and genes are then integrated by a support vector machine to build a prediction model, by which each validation sample is assigned a survival/relapse risk score for drawing Kaplan-Meier survival curves. We apply this method to two data sets: diffuse large-B-cell lymphoma (DLBCL) and primary lung adenocarcinoma. In both cases, patients in the high and low risk groups stratified by our risk scores are clearly distinguishable. We also compare our risk scores to some clinical factors, such as the International Prognostic Index score for DLBCL analysis and tumor stage information for lung adenocarcinoma. Our results indicate that gene expression profiles combined with carefully chosen learning algorithms can predict patient survival for certain diseases.
  • Keywords
    cancer; cellular biophysics; entropy; genetics; learning (artificial intelligence); lung; medical computing; molecular biophysics; patient monitoring; physiological models; support vector machines; tumours; International Prognostic Index score; Kaplan-Meier survival curves; Wilcoxon rank sum test; clinical outcome prediction; diffuse large-B-cell lymphoma; disease progression monitoring; entropy; feature selection methods; gene expression profiles; genes selection; learning algorithms; patient sample selection; patient survival; prediction model; primary lung adenocarcinoma; support vector machine; survival/relapse risk score; tumor stage information; Diseases; Entropy; Gene expression; Information analysis; Lungs; Patient monitoring; Predictive models; Risk analysis; Support vector machines; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332451
  • Filename
    1332451
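
The abstract's first two steps (keeping only extreme training samples, then ranking genes with a rank sum test) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `(follow_up_years, died)` tuple layout, and the cutoff values are all hypothetical, and the entropy measure and support vector machine steps are omitted.

```python
from typing import List, Sequence, Tuple


def select_extreme_samples(
    patients: Sequence[Tuple[float, bool]],
    short_cutoff: float,
    long_cutoff: float,
) -> Tuple[List[int], List[int]]:
    """Return indices of short-term and long-term survivors.

    Assumption: each patient is (follow_up_years, died). A short-term
    survivor died at or before short_cutoff; a long-term survivor was
    followed for at least long_cutoff, so was alive at that point.
    """
    short_term = [i for i, (t, died) in enumerate(patients)
                  if died and t <= short_cutoff]
    long_term = [i for i, (t, _) in enumerate(patients)
                 if t >= long_cutoff]
    return short_term, long_term


def rank_sum(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Wilcoxon rank sum statistic: sum of the ranks of group_a in the
    pooled, sorted data, with tied values sharing their average rank."""
    pooled = sorted(list(group_a) + list(group_b))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold equal values; their 1-based ranks are
        # i+1..j, whose average is (i + 1 + j) / 2.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(rank_of[v] for v in group_a)


if __name__ == "__main__":
    # Hypothetical toy cohort: (follow-up in years, died?)
    cohort = [(0.5, True), (6.0, False), (3.0, True), (8.0, True), (0.8, False)]
    short_idx, long_idx = select_extreme_samples(cohort, 1.0, 5.0)
    print(short_idx, long_idx)

    # Hypothetical expression values of one gene in the two extreme groups.
    print(rank_sum([1.0, 2.0], [3.0, 4.0, 5.0]))
```

In this sketch, a patient censored before the short cutoff (alive at last follow-up) falls into neither group, which mirrors the idea of training only on samples whose outcome is unambiguous. A rank sum close to its minimum or maximum possible value for a gene suggests that its expression separates the two groups well.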