  • DocumentCode
    2775435
  • Title
    Integrating Knowledge in Search of Biologically Relevant Genes
  • Author
    Zhao, Zheng ; Sharma, Shashvata ; Agarwal, Nitin ; Liu, Huan ; Wang, Jiangxin ; Chang, Yung
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Arizona State Univ., Tempe, AZ, USA
  • fYear
    2009
  • fDate
    6-6 Dec. 2009
  • Firstpage
    88
  • Lastpage
    93
  • Abstract
    Gene selection aims at detecting biologically relevant genes to assist biologists' research. The cDNA microarray data used in gene selection is usually "wide". With more than ten thousand genes but fewer than a hundred samples, many biologically irrelevant genes can gain statistical relevance by sheer randomness. Moreover, even for genes that are biologically relevant, biologists often prefer the "trigger" to the "fire". Addressing these problems goes beyond what the cDNA microarray can offer and necessitates the use of additional information. Recent developments in bioinformatics have made various knowledge sources available, such as the KEGG pathway repository and gene ontology database. Integrating different types of knowledge for gene selection could provide more information about genes and samples. In this work, we propose a novel framework to integrate different types of knowledge for identifying biologically relevant genes. The framework converts different types of external knowledge to its internal knowledge, which can be used to rank genes. Upon obtaining the ranking lists, it aggregates them via a probabilistic model and generates a final ranking list. Experimental results from our study on acute lymphoblastic leukemia demonstrate the novelty and efficacy of the proposed framework and show that using different types of knowledge together can help detect biologically relevant genes.
  • Keywords
    biology computing; genetics; ontologies (artificial intelligence); probability; KEGG pathway repository; acute lymphoblastic leukemia; biologically relevant genes; cDNA microarray data; gene ontology database; gene selection; probabilistic model; statistical relevance; Bioinformatics; Biological processes; Biology; Clustering algorithms; Data mining; Databases; Fires; Machine learning algorithms; Ontologies; Pediatrics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4244-5384-9
  • Electronic_ISBN
    978-0-7695-3902-7
  • Type
    conf
  • DOI
    10.1109/ICDMW.2009.21
  • Filename
    5360522