DocumentCode :
3466432
Title :
Integrating Semantic Knowledge into Text Similarity and Information Retrieval
Author :
Müller, Christof ; Gurevych, Iryna ; Mühlhäuser, Max
Author_Institution :
Darmstadt Univ. of Technol., Darmstadt
fYear :
2007
fDate :
17-19 Sept. 2007
Firstpage :
257
Lastpage :
264
Abstract :
This paper studies the influence of lexical semantic knowledge upon two related tasks: ad-hoc information retrieval and text similarity. For this purpose, we compare the performance of two algorithms: (i) using semantic relatedness, and (ii) using a conventional extended Boolean model [12]. For the evaluation, we use two different test collections in the German language: (i) GIRT [5] for the information retrieval task, and (ii) a collection of descriptions of professions built to evaluate a system for electronic career guidance in the information retrieval and text similarity task. We found that integrating lexical semantic knowledge improves performance for both tasks. On the GIRT corpus, the performance is improved only for short queries. The performance on the collection of professional descriptions is improved, but crucially depends on the preprocessing of natural language essays employed as topics.
Keywords :
computational linguistics; information retrieval; natural languages; text analysis; Boolean model; GIRT corpus; German language; ad-hoc information retrieval; lexical semantic knowledge; natural language essays; semantic relatedness; text similarity; Electronic equipment testing; Engineering profession; Information retrieval; Natural languages; Pervasive computing; Strontium; System testing; Thesauri; Vocabulary; Writing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Semantic Computing, 2007. ICSC 2007. International Conference on
Conference_Location :
Irvine, CA
Print_ISBN :
978-0-7695-2997-4
Type :
conf
DOI :
10.1109/ICSC.2007.12
Filename :
4338357