• DocumentCode
    2370531
  • Title
    Effectiveness of information extraction, multi-relational, and semi-supervised learning for predicting functional properties of genes
  • Author
    Krogel, Mark-A.; Scheffer, Tobias
  • Author_Institution
    FIN/IWS, Univ. of Magdeburg, Germany
  • fYear
    2003
  • fDate
    19-22 Nov. 2003
  • Firstpage
    569
  • Lastpage
    572
  • Abstract
    We focus on the problem of predicting functional properties of the proteins corresponding to genes in the yeast genome. Our goal is to study the effectiveness of approaches that utilize all data sources available in this problem setting, including unlabeled data, relational data, and abstracts of research papers. We study transduction and co-training as ways of using unlabeled data. We investigate a propositionalization approach that uses relational gene interaction data. We study the benefit of information extraction for utilizing a collection of scientific abstracts. The tasks studied are KDD Cup tasks from 2001 and 2002. The solutions we describe achieved the highest score for task 2 in 2001, fourth place for task 3 in 2001, and, for task 2 in 2002, the highest score on one of the two subtasks and third place overall.
  • Keywords
    data mining; information retrieval; learning (artificial intelligence); relational databases; co-training; gene functional property prediction; information extraction; multirelational data; propositionalization approach; relational gene interaction data; semisupervised learning; unlabeled data; yeast genome; Abstracts; Bioinformatics; Computer science; Data mining; Fungi; Genetics; Genomics; Hidden Markov models; Proteins; Semisupervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
  • Print_ISBN
    0-7695-1978-4
  • Type
    conf
  • DOI
    10.1109/ICDM.2003.1250979
  • Filename
    1250979