Title :
Quality benchmarking relational databases and Lucene in the TREC4 adhoc task environment
Author :
Arslan, Ahmet ; Yilmazel, Ozgur
Author_Institution :
Comput. Eng. Dept., Anadolu Univ., Eskisehir, Turkey
Abstract :
This work compares the text retrieval quality of open source relational databases and Lucene, a full text search engine library, over English documents. The TREC-4 adhoc task is completed to compare both search effectiveness and search efficiency. Two relational database management systems and four well-known English stemming algorithms have been tried. It has been found that language-specific preprocessing improves retrieval quality for all systems. The results of the English text retrieval experiments using Lucene are on par with the top six results presented at the TREC-4 automatic adhoc task. Although open source relational databases have integrated full text retrieval technology, their relevance ranking mechanisms are not as good as Lucene's.
Keywords :
benchmark testing; database management systems; information retrieval; public domain software; relational databases; search engines; text analysis; Lucene; TREC4 adhoc task environment; English document; English stemming algorithm; English text retrieval; language specific preprocessing; open source relational databases; quality benchmarking relational database; relational database management system; retrieval quality; search engine library; text retrieval quality; Indexes; Libraries; Natural languages; Relational databases; Search engines
Conference_Title :
Computer Science and Information Technology (IMCSIT), Proceedings of the 2010 International Multiconference on
Conference_Location :
Wisla
Print_ISBN :
978-1-4244-6432-6
DOI :
10.1109/IMCSIT.2010.5679643