DocumentCode
2370531
Title
Effectiveness of information extraction, multi-relational, and semi-supervised learning for predicting functional properties of genes
Author
Krogel, Mark-A ; Scheffer, Tobias
Author_Institution
FIN/IWS, Univ. of Magdeburg, Germany
fYear
2003
fDate
19-22 Nov. 2003
Firstpage
569
Lastpage
572
Abstract
We focus on the problem of predicting functional properties of the proteins corresponding to genes in the yeast genome. Our goal is to study the effectiveness of approaches that utilize all data sources available in this problem setting, including unlabeled data, relational data, and abstracts of research papers. We study transduction and co-training for using unlabeled data. We investigate a propositionalization approach that uses relational gene interaction data. We study the benefit of information extraction for utilizing a collection of scientific abstracts. The tasks studied are the KDD Cup tasks of 2001 and 2002. The solutions we describe achieved the highest score for task 2 in 2001, fourth place for task 3 in 2001, and, for task 2 in 2002, the highest score on one of the two subtasks and third place overall.
Keywords
data mining; information retrieval; learning (artificial intelligence); relational databases; co-training; gene functional property prediction; information extraction; multirelational data; propositionalization approach; relational gene interaction data; semisupervised learning; unlabeled data; yeast genome; Abstracts; Bioinformatics; Computer science; Data mining; Fungi; Genetics; Genomics; Hidden Markov models; Proteins; Semisupervised learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN
0-7695-1978-4
Type
conf
DOI
10.1109/ICDM.2003.1250979
Filename
1250979