DocumentCode :
1993058
Title :
Biomedical Literature Mining
Author :
Hu, Xiaohua
Author_Institution :
Drexel Univ., Philadelphia
fYear :
2007
fDate :
14-17 Oct. 2007
Firstpage :
1446
Lastpage :
1446
Abstract :
Despite an influx of molecular data in the form of sequences, structures, transcription profiles, etc., most of the protein interaction information relevant to cell biology research still exists only in the scientific literature, which is written in natural language that computers cannot easily manipulate. Automatically mining and extracting information from biomedical text holds the promise of consolidating large amounts of biological knowledge in computer-accessible form. In this talk, we present a novel approach, Bio-IEDM (Biomedical Information Extraction and Data Mining), that integrates text mining and predictive modeling to analyze biomolecular networks derived from biomedical literature databases. Our method consists of two phases. In phase 1, we discuss an efficient semi-supervised learning approach that automatically extracts biological relationships, such as protein-protein and protein-gene interactions, from biomedical literature databases to construct the biomolecular network. In phase 2, we present a novel clustering algorithm that analyzes the biomolecular network graph to identify biologically meaningful subnetworks (communities). The clustering algorithm takes into account the characteristics of scale-free network graphs and is based on the local density of each vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. The experimental results indicate that our approach is very effective in extracting biological knowledge from a huge collection of biomedical literature. The integration of data mining and information extraction provides a promising direction for analyzing biomolecular networks.
Keywords :
biology computing; cellular biophysics; complex networks; data mining; genetics; molecular biophysics; proteins; Bio-IEDM; biomedical information extraction; biomedical literature databases; biomolecular network graph; cell biology; clustering algorithm; data mining; protein-gene interaction; scale-free network graphs; Biological cells; Biology computing; Biomedical computing; Clustering algorithms; Data mining; Databases; Information analysis; Natural languages; Proteins; Sequences;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4244-1509-0
Type :
conf
DOI :
10.1109/BIBE.2007.4375765
Filename :
4375765