• DocumentCode
    2498163
  • Title
    A hybrid framework for genome wide epistasis discovery
  • Author
    Tan, Zehao ; Zhang, Zhuo ; Liu, Jiang ; Kwoh, Chee Keong ; Ong, Sim Heng ; Teo, Yik Ying ; Khor, Chiea Chuen ; Tai, E. Shyong ; Aung, Tin ; Vithana, Eranga ; Wong, Tien Yin
  • Author_Institution
    Infocomm Res., A*STAR, Singapore, Singapore
  • fYear
    2011
  • fDate
    Aug. 30 2011-Sept. 3 2011
  • Firstpage
    6479
  • Lastpage
    6482
  • Abstract
    A hybrid framework integrating Random Forest and Logistic Regression is proposed and implemented for genome-wide epistasis study. The two-stage approach first uses a random forest model to capture a pool of epistasis-prone single nucleotide polymorphisms (SNPs), then applies logistic regression to identify significant pairwise epistatic SNPs. We tested the proposed framework on data from the Singapore Malay Eye Study (SiMES), in which 3280 subjects were genotyped on Illumina 610 quad arrays and optic nerve parameters were measured during ocular examination. The case-control data set is labeled by selecting subjects at the high and low ends of the vertical cup-to-disc ratio (vCDR), a measure of optic nerve degeneration. Our method identified 230 pairs of interacting SNPs with P-values below 5×10⁻⁸. A preliminary search identified a protein interaction network at a high confidence score of 0.9. The proteins are known to participate in the WNT pathway, with involvement in the survival and differentiation of retinal ganglion cells, suggesting a strong association with vCDR. The experimental results demonstrate that the proposed framework is valid and efficient for large-scale epistasis studies.
  • Keywords
    cellular biophysics; decision trees; eye; genomics; medical computing; molecular biophysics; neurophysiology; proteins; random processes; regression analysis; Singapore Malay eye study; WNT pathway; cell differentiation; cell survival; genome wide epistasis discovery; hybrid framework; Illumina 610 quad array; logistic regression; ocular examination; optic nerve degeneration; optic nerve parameter; pairwise epistasis; protein interaction network; random forest model; retina ganglion cell; single nucleotide polymorphism; vertical cup-to-disc ratio value; Bioinformatics; Diseases; Genomics; Logistics; Optical variables measurement; Proteins; Vegetation; Case-Control Studies; Computational Biology; Epistasis, Genetic; Genetic Predisposition to Disease; Genome-Wide Association Study; Genotype; Humans; Models, Genetic; Models, Statistical; Optic Nerve; Optic Nerve Diseases; Polymorphism, Single Nucleotide; Regression Analysis; Retinal Ganglion Cells; Wnt Proteins
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE
  • Conference_Location
    Boston, MA
  • ISSN
    1557-170X
  • Print_ISBN
    978-1-4244-4121-1
  • Electronic_ISBN
    1557-170X
  • Type
    conf
  • DOI
    10.1109/IEMBS.2011.6091599
  • Filename
    6091599
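
The second stage described in the abstract, testing a SNP pair for interaction with logistic regression, might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the additive 0/1/2 genotype coding, the Wald test on the interaction coefficient, the simulated genotypes, and every function name are hypothetical choices, since the abstract does not specify them. The random forest screening stage is not shown.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_logistic(rows, labels, iterations=25):
    """Fit logistic regression by Newton-Raphson.

    Returns (coefficients, standard errors). The standard errors come from
    the inverse Fisher information of the last iteration, which matches the
    final estimate once the fit has converged.
    """
    k = len(rows[0])
    beta = [0.0] * k
    cov = None
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, labels):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        cov = invert(info)
        beta = [b + sum(cov[i][j] * grad[j] for j in range(k))
                for i, b in enumerate(beta)]
    return beta, [math.sqrt(cov[i][i]) for i in range(k)]


def interaction_p_value(g1, g2, status):
    """Two-sided Wald p-value for the g1*g2 term in
    logit(P(case)) = b0 + b1*g1 + b2*g2 + b3*g1*g2."""
    rows = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    beta, se = fit_logistic(rows, status)
    z = beta[3] / se[3]
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate(n, interaction_effect, seed):
    """Simulate two SNPs (minor allele frequency 0.3, an arbitrary choice)
    and a case/control label driven only by their product."""
    rng = random.Random(seed)
    g1 = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]
    g2 = [(rng.random() < 0.3) + (rng.random() < 0.3) for _ in range(n)]
    status = [
        int(rng.random() < 1.0 / (1.0 + math.exp(1.0 - interaction_effect * a * b)))
        for a, b in zip(g1, g2)
    ]
    return g1, g2, status


if __name__ == "__main__":
    for effect in (1.0, 0.0):
        g1, g2, status = simulate(3000, effect, seed=1)
        print(effect, interaction_p_value(g1, g2, status))
```

In a genome-wide setting, a test of this kind would be run only on pairs drawn from the pool selected in the first stage, and each resulting P-value compared against a stringent threshold such as the 5×10⁻⁸ quoted in the abstract.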