• DocumentCode
    2822835
  • Title
    Variable selection in statistical models using population-based incremental learning with applications to genome-wide association studies
  • Author
    Nguyen, Hien Duy ; Wood, Ian A.
  • Author_Institution
    Sch. of Math. & Phys., Univ. of Queensland, St. Lucia, QLD, Australia
  • fYear
    2012
  • fDate
    10-15 June 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Variable selection is the problem of choosing the subset of explanatory variables for a regression or classification model such that the resulting model is best according to some criterion. Here we consider the use of population-based incremental learning (PBIL) to select the variables for a linear regression model to predict a quantitative trait in living organisms. The data here is simulated to represent a genome-wide association study (GWAS) using single nucleotide polymorphisms (SNPs) as explanatory variables and height as an example trait. PBIL was effective in optimizing a variety of model fitness criteria. The resulting models were found to have true positive and false negative rates comparable to those of competing methods.
  • Keywords
    bioinformatics; data mining; learning (artificial intelligence); pattern classification; regression analysis; GWAS; PBIL; SNP; classification model; genome-wide association studies; linear regression model; living organisms; machine learning; model fitness criteria; population-based incremental learning; single nucleotide polymorphisms; statistical models; variable selection; Accuracy; Biological system modeling; Input variables; Linear regression; Predictive models; Prototypes; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation (CEC), 2012 IEEE Congress on
  • Conference_Location
    Brisbane, QLD
  • Print_ISBN
    978-1-4673-1510-4
  • Electronic_ISBN
    978-1-4673-1508-1
  • Type
    conf
  • DOI
    10.1109/CEC.2012.6256577
  • Filename
    6256577
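
The abstract describes using PBIL to choose which explanatory variables enter a linear regression model. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes BIC as the fitness criterion (the abstract only says "a variety of model fitness criteria"), uses the basic PBIL update toward the best sampled individual without a mutation step, and clips the probabilities as a simple safeguard. The names `bic` and `pbil_select`, all parameter values, and the simulated SNP-like data are hypothetical and chosen for illustration only.

```python
import numpy as np


def bic(X, y, mask):
    """BIC of an OLS fit (with intercept) on the columns selected by mask.

    Lower is better. BIC is an assumed criterion, not taken from the paper.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, mask]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    return n * np.log(rss / n) + design.shape[1] * np.log(n)


def pbil_select(X, y, n_gen=60, pop_size=40, lr=0.1, seed=0):
    """Basic PBIL over inclusion masks; all defaults are illustrative."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    p = np.full(d, 0.5)  # probability that each variable is included
    best_mask, best_score = None, np.inf
    for _ in range(n_gen):
        # Sample a population of binary inclusion vectors from p.
        pop = rng.random((pop_size, d)) < p
        scores = np.array([bic(X, y, m) for m in pop])
        i = int(np.argmin(scores))
        if scores[i] < best_score:
            best_mask, best_score = pop[i].copy(), scores[i]
        # Shift the probability vector toward this generation's best individual.
        p = (1 - lr) * p + lr * pop[i]
        # Assumed safeguard: keep probabilities away from 0 and 1.
        p = np.clip(p, 0.02, 0.98)
    return best_mask, best_score, p


# Hypothetical simulated data: 15 SNP-like genotypes coded 0/1/2,
# two of which (columns 0 and 3) affect the quantitative trait.
rng = np.random.default_rng(1)
X = rng.binomial(2, 0.3, size=(200, 15)).astype(float)
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(size=200)

mask, score, p = pbil_select(X, y)
print("selected columns:", np.flatnonzero(mask))
print("BIC of selected model:", round(float(score), 2))
```

Other criteria (for example AIC or a cross-validated error) could be swapped in by replacing `bic`, since the PBIL loop only needs a score for each candidate subset.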