DocumentCode :
1653838
Title :
A New Language Model Combining Single and Compound Terms
Author :
Hammache, Arezki ; Ahmed-Ouamer, Rachid ; Boughanem, Mohand
Author_Institution :
Lab. LARI, Univ. Mouloud Mammeri, Tizi-Ouzou, Algeria
Volume :
1
fYear :
2011
Firstpage :
67
Lastpage :
70
Abstract :
Most traditional information retrieval systems are based on single-term indexing. However, it is widely accepted that the semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although several works have incorporated phrases or other syntactic information in IR, such attempts have shown slight benefit at best. In language modeling approaches in particular, this is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach that considers and weights only certain types of n-grams, namely "compound terms". Experimental results on three test collections showed an improvement.
Keywords :
document handling; indexing; information retrieval systems; natural language processing; bigram models; compound terms; document semantic content; information retrieval systems; language modeling approach; n-gram models; single terms indexing; Compounds; Computational modeling; Data models; Hidden Markov models; Indexing; Information retrieval; Smoothing methods; compound term indexing; information retrieval; language model;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location :
Lyon
Print_ISBN :
978-1-4577-1373-6
Electronic_ISBN :
978-0-7695-4513-4
Type :
conf
DOI :
10.1109/WI-IAT.2011.52
Filename :
6040498