Title :
Prediction of EST Functional Relationships via Literature Mining With User-Specified Parameters
Author :
Wang, Hei-Chia ; Huang, Tian-Hsiang
Author_Institution :
Institute of Information Management, National Cheng Kung University, Tainan
fDate :
4/1/2009
Abstract :
The massive amount of expressed sequence tags (ESTs) gathered over recent years has triggered great interest in efficient applications for genomic research. In particular, EST functional relationships can be used to determine a possible gene network for biological processes of interest. In recent years, many researchers have tried to determine EST functional relationships by analyzing the biological literature. However, it has been challenging to find efficient prediction methods. Moreover, an annotated EST is usually associated with many functions, so successful methods must be able to distinguish between relevant and irrelevant functions based on user specifications. This paper proposes a method to discover functional relationships between ESTs of interest by analyzing literature from the Medical Literature Analysis and Retrieval System Online (MEDLINE), with user-specified parameters for selecting keywords. This method performs better than the multiple kernel documents method in setting up a specific threshold for gathering materials. The method is also able to uncover known functional relationships, as shown by a comparison with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The reliable EST relationships predicted by the proposed method can help to construct gene networks for specific biological functions of interest.
Keywords :
bioinformatics; data mining; genetics; genomics; information retrieval systems; molecular biophysics; EST functional relationship; Kyoto Encyclopedia of Genes and Genomes database; Medical Literature Analysis and Retrieval System Online; expressed sequence tags; gene network; genomic research; literature mining; Biochemistry; Bioinformatics; Biological materials; Biological processes; Couplings; Databases; Encyclopedias; Genomics; Information management; Kernel; Organisms; Prediction methods; Expressed sequence tags (ESTs); Medical Literature Analysis and Retrieval System Online (MEDLINE); functional relationship; Algorithms; Expressed Sequence Tags; Information Storage and Retrieval; MEDLINE; Models, Statistical; Random Allocation; Seeds; User-Computer Interface; Vocabulary, Controlled
Journal_Title :
Biomedical Engineering, IEEE Transactions on
DOI :
10.1109/TBME.2008.2009765