  • DocumentCode
    9789
  • Title
    Network-Based Methods to Identify Highly Discriminating Subsets of Biomarkers
  • Author
    Sajjadi, Seyed Javad; Qian, Xiaoning; Zeng, Bo; Adl, Amin Ahmadi
  • Author_Institution
    Dept. of Ind. & Manage. Syst. Eng., Univ. of South Florida, Tampa, FL, USA
  • Volume
    11
  • Issue
    6
  • fYear
    2014
  • fDate
    Nov.-Dec. 2014
  • Firstpage
    1029
  • Lastpage
    1037
  • Abstract
    Complex diseases, such as various types of cancer and diabetes, are conjectured to be triggered and influenced by a combination of genetic and environmental factors. To integrate potential effects from the interplay among underlying candidate factors, we propose a new network-based framework that identifies effective biomarkers by searching for groups of synergistic risk factors with high predictive power for disease outcome. An interaction network is constructed in which node weights represent the individual predictive power of candidate factors and edge weights capture pairwise synergistic interactions among factors. We then formulate this network-based biomarker identification problem as a novel graph optimization model that searches for multiple cliques with maximum overall weight, which we denote the Maximum Weighted Multiple Clique Problem (MWMCP). To achieve optimal or near-optimal solutions, we derive both an analytical algorithm based on the column generation method and a fast heuristic for large-scale networks. Our algorithms for MWMCP have been applied to two biomedical data sets: a Type 1 Diabetes (T1D) data set from the Diabetes Prevention Trial-Type 1 (DPT-1) study, and a breast cancer genomics data set for metastasis prognosis. The results demonstrate that our network-based methods can identify important biomarkers with better prediction accuracy than conventional feature selection, which considers only individual effects.
  • Keywords
    cancer; data analysis; feature selection; genomics; medical computing; optimisation; analytical algorithm; biomarkers; biomedical data sets; breast cancer genomics data set; column generation method; complex diseases; conventional feature selection; diabetes prevention trial-type 1; disease outcome; environmental factors; fast heuristic large-scale networks; genetic factors; highly discriminating subsets; maximum overall weight; maximum weighted multiple clique problem; metastasis prognosis; near optimal solutions; network-based biomarker identification problem; network-based methods; novel graph optimization model; pairwise synergistic interactions; synergistic risk factors; type 1 diabetes data set; Bioinformatics; Biological system modeling; Computational biology; Diabetes; Diseases; Statistical analysis; column generation; discriminating biomarkers
  • fLanguage
    English
  • Journal_Title
    IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2014.2325014
  • Filename
    6817569
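
The abstract describes an interaction network whose node weights score individual factors and whose edge weights score pairwise synergy, searched for multiple cliques of maximum overall weight. The Python sketch below illustrates only that scoring idea on a toy network. It is not the paper's column generation algorithm or its heuristic: the greedy growth rule, the node-disjoint assumption, and the names `clique_weight`, `greedy_cliques`, `k`, and `max_size` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def clique_weight(nodes, node_w, edge_w):
    """Sum of node weights plus the weights of all edges inside `nodes`.

    Assumes `nodes` really is a clique, i.e. every pair has an entry in edge_w.
    """
    total = sum(node_w[v] for v in nodes)
    for u, v in combinations(sorted(nodes), 2):
        total += edge_w[frozenset((u, v))]
    return total


def greedy_cliques(node_w, edge_w, k, max_size):
    """Greedily build up to k node-disjoint cliques (illustrative assumption)."""
    unused = set(node_w)
    cliques = []
    while unused and len(cliques) < k:
        # Seed each clique with the heaviest node that is still unused.
        seed = max(sorted(unused), key=lambda v: node_w[v])
        clique = [seed]
        while len(clique) < max_size:
            best, best_gain = None, 0.0
            for c in sorted(unused - set(clique)):
                # A candidate must be adjacent to every current member.
                if not all(frozenset((c, m)) in edge_w for m in clique):
                    continue
                gain = node_w[c] + sum(edge_w[frozenset((c, m))] for m in clique)
                if gain > best_gain:
                    best, best_gain = c, gain
            if best is None:
                break  # no adjacent candidate adds positive weight
            clique.append(best)
        cliques.append(clique)
        unused -= set(clique)
    return cliques


# Toy network with made-up weights (not data from the paper).
toy_node_w = {"a": 0.6, "b": 0.5, "c": 0.4, "d": 0.3}
toy_edge_w = {
    frozenset(("a", "b")): 0.2,
    frozenset(("a", "c")): 0.5,
    frozenset(("b", "c")): 0.1,
    frozenset(("c", "d")): 0.4,
}
toy_cliques = greedy_cliques(toy_node_w, toy_edge_w, k=2, max_size=3)
```

A greedy pass like this gives no optimality guarantee; it is shown only to make the node-plus-edge weighting of a clique concrete.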