• DocumentCode
    3394099
  • Title
    A framework for the application of decision trees to the analysis of SNPs data
  • Author
    Fiaschi, Linda ; Garibaldi, Jonathan M. ; Krasnogor, Natalio
  • Author_Institution
    Sch. of Comput. Sci., Univ. of Nottingham, Nottingham
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    106
  • Lastpage
    113
  • Abstract
    Data mining is the analysis of experimental datasets to extract trends and relationships that can be meaningful for the user. In genetic studies these techniques have revealed interesting findings, especially in the heritable predisposition to contract specific diseases. One of these diseases which is still under extensive analysis is pre-eclampsia, a progressive disorder which occurs during pregnancy and soon after the birth, affecting both the mothers and their babies. There are many choices to be made in the application of the various data mining techniques that may be used to study general genotype-phenotype associations. The aim of this paper is to describe the general framework that we adopted in the application of decision tree algorithms to the analysis of SNPs data related to cases of pre-eclampsia. The results show the validity of this methodology to detect a subset of attributes associated with the predictable variable, providing a reduction in the size of the dataset. Moreover, from the clinical point of view, it confirmed the medical interpretation of the 'corrected birth-weight centile' (CBC) value of 10 being a meaningful cut-off and confirmed association between an infant's CBC and the 'week of delivery' parameter. We hope that the generic framework described here will be of use to other researchers analysing such data.
  • Keywords
    DNA; biology computing; data mining; decision trees; diseases; genetics; genomics; medical computing; obstetrics; paediatrics; SNP data analysis; corrected birth-weight centile; data mining; decision tree algorithms; genetic studies; genotype-phenotype associations; heritable diseases; pre-eclampsia; pregnancy; progressive disorder; single nucleotide polymorphism; Blood pressure; DNA; Data analysis; Data mining; Decision trees; Diseases; Genetics; Humans; Pediatrics; Pregnancy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Bioinformatics and Computational Biology, 2009. CIBCB '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2756-7
  • Type
    conf
  • DOI
    10.1109/CIBCB.2009.4925715
  • Filename
    4925715