• DocumentCode
    3461950
  • Title

    MetaExtract: an NLP system to automatically assign metadata

  • Author

    Yilmazel, Ozgur ; Finneran, Christina M. ; Liddy, Elizabeth D.

  • Author_Institution
    Center for Natural Language Process., Syracuse Univ., NY, USA
  • fYear
    2004
  • fDate
    7-11 June 2004
  • Firstpage
    241
  • Lastpage
    242
  • Abstract
    We have developed MetaExtract, a system that automatically assigns Dublin Core + GEM metadata using extraction techniques from our natural language processing research. MetaExtract comprises three distinct processes: an eQuery extraction module, an HTML-based extraction module, and a keyword generator module. We conducted a Web-based survey in which users evaluated the quality of each metadata element. Only two of the elements, title and keyword, showed a significant difference in quality between automatically and manually assigned metadata, with the manual quality slightly higher. The remaining elements for which we had enough data to test showed no significant difference: description, grade, duration, essential resources, pedagogy-teaching method, and pedagogy-group.
  • Keywords
    hypermedia markup languages; information retrieval; meta data; natural language interfaces; natural languages; HTML-based extraction module; MetaExtract system; NLP; Web-based survey; eQuery; information extraction technique; keyword generator module; metadata; natural language processing system; Artificial intelligence; Data mining; Educational activities; HTML; Knowledge based systems; Measurement standards; Natural language processing; Permission; Software libraries; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Libraries, 2004. Proceedings of the 2004 Joint ACM/IEEE Conference on
  • Print_ISBN
    1-58113-832-6
  • Type

    conf

  • DOI
    10.1109/JCDL.2004.1336129
  • Filename
    1336129