DocumentCode :
2156145
Title :
Automatic Extraction of Bibliographic Information from Biomedical Online Journal Articles Using a String Matching Algorithm
Author :
Kim, Jongwoo ; Le, Daniel X. ; Thoma, George R.
Author_Institution :
Nat. Libr. of Medicine, Bethesda, MD
fYear :
2006
fDate :
0-0 0
Firstpage :
905
Lastpage :
912
Abstract :
A system has been developed to extract bibliographic data (grant numbers and databank accession numbers) from online biomedical journal articles for the National Library of Medicine's MEDLINE® database. Rule-based algorithms and a string matching algorithm are proposed to extract the bibliographic data from HTML-formatted articles. Experiments conducted with 411 medical articles from 73 journal issues show an accuracy exceeding 96%.
Keywords :
bibliographic systems; information retrieval; knowledge based systems; medical information systems; string matching; MEDLINE database; automatic extraction; bibliographic information; biomedical online journal articles; databank accession numbers; grant numbers; rule-based algorithms; string matching algorithm; Data mining; Databases; Genetics; HTML; Labeling; Libraries; Mars; Production; Protein sequence; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer-Based Medical Systems, 2006. CBMS 2006. 19th IEEE International Symposium on
Conference_Location :
Salt Lake City, UT
ISSN :
1063-7125
Print_ISBN :
0-7695-2517-1
Type :
conf
DOI :
10.1109/CBMS.2006.55
Filename :
1647685