Title :
Building a search engine model with morphological normalization support
Author :
Jure Mijic;Bojana Dalbelo Basic;Jan Snajder
Author_Institution :
Faculty of Electrical Engineering and Computing, University of Zagreb, Unska 3, 10000, Croatia
Abstract :
Searching a collection of documents can seem like an easy task, but manipulating textual data can be difficult because the data are mostly unstructured. We undertook the task of building an effective search engine for a collection of Croatian legislative documents. The developed search engine model supports multiple modules for information retrieval. To improve the effectiveness of the retrieval, we used a morphological normalization module that uses an inflectional lexicon automatically acquired from a document corpus. As we do not have a gold standard for our legislative document collection, we evaluated our search engine on three English test collections, explored the effects of stemming, and compared the results to the vector space model.
Keywords :
"Search engines","Indexes","Databases","Information retrieval","Buildings","Law","Indexing"
Conference_Title :
Information Technology Interfaces, 2008. ITI 2008. 30th International Conference on
Print_ISBN :
978-953-7138-12-7
DOI :
10.1109/ITI.2008.4588481