  • DocumentCode
    3628495
  • Title
    Building a search engine model with morphological normalization support
  • Author
    Jure Mijic;Bojana Dalbelo Basic;Jan Snajder
  • Author_Institution
    Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000, Croatia
  • fYear
    2008
  • Firstpage
    619
  • Lastpage
    624
  • Abstract
    Searching a collection of documents can seem like an easy task, but manipulating textual data can be difficult because the data are mostly unstructured. We undertook the task of building an effective search engine for a collection of Croatian legislative documents. The developed search engine model supports multiple modules for information retrieval. To improve the effectiveness of the retrieval, we used a morphological normalization module that uses an inflectional lexicon automatically acquired from a document corpus. As we do not have a gold standard for our legislative document collection, we evaluated our search engine on three English test collections, explored the effects of stemming, and compared the results to the vector space model.
  • Keywords
    "Search engines","Indexes","Databases","Information retrieval","Buildings","Law","Indexing"
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology Interfaces, 2008. ITI 2008. 30th International Conference on
  • ISSN
    1330-1012
  • Print_ISBN
    978-953-7138-12-7
  • Type
    conf
  • DOI
    10.1109/ITI.2008.4588481
  • Filename
    4588481