Title :
Exploiting Wikipedia for Directional Inferential Text Similarity
Author :
Wee, Leong Chee ; Hassan, Samer
Author_Institution :
Univ. of Delaware, Newark
Abstract :
In natural languages, variability of semantic expression refers to the situation where the same meaning can be inferred from different words or texts. Many natural language processing tasks (e.g., question answering, information retrieval, document summarization) model this variability by requiring a specific target meaning to be inferred from different text variants, so it is helpful to capture text similarity in a directional manner to serve such inference needs. In this paper, we show how Wikipedia can be used as a semantic resource to build a directional inferential similarity metric between words and, subsequently, texts. Experiments on a standard evaluation dataset show that our Wikipedia-based metric performs significantly better than the random metric baseline, with a reduction in error rate of 16.1%.
Keywords :
Internet; natural language processing; text analysis; Wikipedia-based metric; directional inferential text similarity; Computer science; Encyclopedias; Error analysis; Humans; Information retrieval; Information technology; Natural languages; Performance evaluation; Wikipedia; Directional; Inference; Semantic; Similarity
Conference_Title :
Information Technology: New Generations, 2008. ITNG 2008. Fifth International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
0-7695-3099-0
DOI :
10.1109/ITNG.2008.190