DocumentCode :
2096073
Title :
Integrating Biomedical Publications with Existing Metadata
Author :
Nikolov, Nikolay ; Stoehr, Peter
Author_Institution :
Eur. Bioinf. Inst., Poznan
fYear :
2008
fDate :
17-19 June 2008
Firstpage :
653
Lastpage :
655
Abstract :
Biomedical literature is currently largely disconnected from its metadata. While freely accessible, centralised metadata repositories exist, the publications themselves are scattered across a large number of repositories. We address this problem by harvesting freely accessible biomedical publications from the Web and integrating them with the corresponding metadata. The system applies title recognition to the harvested publications using a knowledge-based algorithm, then performs a fuzzy match between the extracted title and the metadata records using an edit-distance metric. So far we have located more than 300,000 publications on the Web and achieved over 96% precision and nearly 85% recall on a random sample of 250 documents harvested from the Web.
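The fuzzy-matching step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which edit-distance variant, normalisation, or acceptance threshold the authors used, so everything below is an assumption: the code uses plain Levenshtein distance, lower-cases and collapses whitespace before comparing, and accepts a match when the distance is at most 20% of the longer title. The names `edit_distance`, `normalize`, `best_match`, and `max_ratio` are hypothetical and do not come from the paper.

```python
from typing import Dict, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def normalize(title: str) -> str:
    """Assumed normalisation: lower-case and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def best_match(extracted_title: str,
               records: Dict[str, str],
               max_ratio: float = 0.2) -> Optional[str]:
    """Return the id of the closest metadata title, or None if none is close.

    `records` maps a record id to its title. `max_ratio` is an assumed
    threshold: distance divided by the longer title's length.
    """
    query = normalize(extracted_title)
    best_id, best_ratio = None, None
    for record_id, record_title in records.items():
        candidate = normalize(record_title)
        longest = max(len(query), len(candidate))
        if longest == 0:
            continue  # nothing to compare
        ratio = edit_distance(query, candidate) / longest
        if best_ratio is None or ratio < best_ratio:
            best_id, best_ratio = record_id, ratio
    if best_ratio is not None and best_ratio <= max_ratio:
        return best_id
    return None
```

A linear scan like this is only practical for small record sets. Matching against a large metadata repository would need some form of candidate pre-filtering, which the abstract does not describe.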
Keywords :
Internet; electronic publishing; fuzzy set theory; knowledge based systems; medical computing; meta data; World Wide Web; biomedical literature; biomedical publication; fuzzy match; knowledge-based algorithm; metadata; Bioinformatics; Biomedical computing; Fuzzy systems; HTML; Indexing; Information retrieval; Text mining; Uniform resource locators; Web services; XML; data integration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer-Based Medical Systems, 2008. CBMS '08. 21st IEEE International Symposium on
Conference_Location :
Jyvaskyla
ISSN :
1063-7125
Print_ISBN :
978-0-7695-3165-6
Type :
conf
DOI :
10.1109/CBMS.2008.127
Filename :
4562076