Context thesaurus for the extraction of metadata from medical research papers

Author

Shepherd, Michael ; Watters, Carolyn ; Young, June

Author_Institution

Fac. of Comput. Sci., Dalhousie Univ., Halifax, NS, Canada

fYear

2004

fDate

5-8 Jan. 2004

Abstract

Much of the academic literature available on the Web has never been adequately catalogued. Consequently, even using large-scale search engines, much of it remains inaccessible to researchers because indexing on this scale lacks the necessary detail to cope with discipline-dependent terminologies and ontologies. Metadata has become a popular means to provide such information within known domains. In this paper, we describe an approach to the automatic extraction of metadata from medical research papers. Medical research papers tend to have stereotypic prescribed sections, such as introduction, methods, and conclusions. The approach described uses context thesauri and the semantic structure of the documents to extract metadata based on these stereotypic sections.

Keywords

cataloguing; data mining; medical administrative data processing; meta data; ontologies (artificial intelligence); thesauri; World Wide Web; automatic extraction; context thesaurus; medical research papers; metadata; ontology; search engines; stereotypic sections; Computer science; Data mining; Indexing; Large-scale systems; Natural languages; Ontologies; Search engines; Semantic Web; Terminology; Thesauri

fLanguage

English

Publisher

IEEE

Conference_Titel

System Sciences, 2004. Proceedings of the 37th Annual Hawaii International Conference on

Print_ISBN

0-7695-2056-1

Type

conf

DOI

10.1109/HICSS.2004.1265359

Filename

1265359