DocumentCode
2949905
Title
BioWizard: Discovering and validating associations between biological entities by integrated analysis of scientific literature and experimental data
Author
Spampinato, Concetto ; Giordano, Daniela ; Kavasidis, Isaak ; Milardo, Sebastiano
Author_Institution
Dept. of Electr., Electron. & Comput. Eng., Univ. of Catania, Catania, Italy
fYear
2012
fDate
20-22 June 2012
Firstpage
1
Lastpage
6
Abstract
In this paper, we present BioWizard, a bioinformatics knowledge discovery tool for extracting and validating implicit associations between biological entities. By mining specialized scientific literature, BioWizard not only generates biological hypotheses in the form of associations between genes, proteins and diseases, but also validates the plausibility of such associations against high-throughput biological data (microarrays) and annotated databases. The main novelties of the proposed approach are that: (1) it infers associations between biological entities by mining full-text papers rather than only abstracts, as existing tools usually do; (2) it uses a named entity recognition step that improves the precision of the derived associations by enriching the vocabularies used in the mining loop with terms extracted directly from the text; and (3) it filters the inferred associations according to their evidence in experimental data. We tested the precision and recall of our system in retrieving known associations (which did not appear in the same document) from gold standards, and the results show that BioWizard is able to retrieve valid associations, thus providing biomedical researchers with a valuable tool to speed up scientific progress.
Keywords
bioinformatics; data mining; diseases; genetics; information retrieval; molecular biophysics; proteins; BioWizard; annotated databases; bioinformatics knowledge discovery tool; biological entities; biological hypothesis generation; experimental data; genes; high-throughput biological data; integrated analysis; known-association retrieval; microarray data; named entity recognition; scientific literature; text mining; valid association retrieval; Databases; Dictionaries; Protein engineering; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer-Based Medical Systems (CBMS), 2012 25th International Symposium on
Conference_Location
Rome
ISSN
1063-7125
Print_ISBN
978-1-4673-2049-8
Type
conf
DOI
10.1109/CBMS.2012.6266327
Filename
6266327