DocumentCode :
3461950
Title :
MetaExtract: an NLP system to automatically assign metadata
Author :
Yilmazel, Ozgur ; Finneran, Christina M. ; Liddy, Elizabeth D.
Author_Institution :
Center for Natural Language Process., Syracuse Univ., NY, USA
fYear :
2004
fDate :
7-11 June 2004
Firstpage :
241
Lastpage :
242
Abstract :
We have developed MetaExtract, a system to automatically assign Dublin Core + GEM metadata using extraction techniques from our natural language processing research. MetaExtract is comprised of three distinct processes: eQuery and HTML-based extraction modules and a keyword generator module. We conducted a Web-based survey to have users evaluate each metadata element's quality. Only two of the elements, title and keyword, were shown to be significantly different, with the manual quality slightly higher. The remaining elements for which we had enough data to test were shown not to be significantly different; they are: description, grade, duration, essential resources, pedagogy-teaching method, and pedagogy-group.
Keywords :
hypermedia markup languages; information retrieval; meta data; natural language interfaces; natural languages; HTML-based extraction module; MetaExtract system; NLP; Web-based survey; eQuery; information extraction technique; keyword generator module; metadata; natural language processing system; Artificial intelligence; Data mining; Educational activities; HTML; Knowledge based systems; Measurement standards; Natural language processing; Permission; Software libraries; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
Print_ISBN :
1-58113-832-6
Type :
conf
DOI :
10.1109/JCDL.2004.1336129
Filename :
1336129