DocumentCode
3461950
Title
MetaExtract: an NLP system to automatically assign metadata
Author
Yilmazel, Ozgur ; Finneran, Christina M. ; Liddy, Elizabeth D.
Author_Institution
Center for Natural Language Process., Syracuse Univ., NY, USA
fYear
2004
fDate
7-11 June 2004
Firstpage
241
Lastpage
242
Abstract
We have developed MetaExtract, a system to automatically assign Dublin Core + GEM metadata using extraction techniques from our natural language processing research. MetaExtract is comprised of three distinct processes: eQuery and HTML-based extraction modules and a keyword generator module. We conducted a Web-based survey to have users evaluate each metadata element's quality. Only two of the elements, title and keyword, were shown to be significantly different, with the manual quality slightly higher. The remaining elements for which we had enough data to test were shown not to be significantly different; they are: description, grade, duration, essential resources, pedagogy-teaching method, and pedagogy-group.
Keywords
hypermedia markup languages; information retrieval; meta data; natural language interfaces; natural languages; HTML-based extraction module; MetaExtract system; NLP; Web-based survey; eQuery; information extraction technique; keyword generator module; metadata; natural language processing system; Artificial intelligence; Data mining; Educational activities; HTML; Knowledge based systems; Measurement standards; Natural language processing; Permission; Software libraries; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
Print_ISBN
1-58113-832-6
Type
conf
DOI
10.1109/JCDL.2004.1336129
Filename
1336129