DocumentCode
1653838
Title
A New Language Model Combining Single and Compound Terms
Author
Hammache, Arezki ; Ahmed-Ouamer, Rachid ; Boughanem, Mohand
Author_Institution
Lab. LARI, Univ. Mouloud Mammeri, Tizi-Ouzou, Algeria
Volume
1
fYear
2011
Firstpage
67
Lastpage
70
Abstract
Most traditional information retrieval systems are based on single-term indexing. However, it is widely accepted that the semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although several works have incorporated phrases or other syntactic information in IR, such attempts have shown slight benefit at best. In language modeling approaches in particular, this is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper we introduce a new approach that considers and weights only certain types of n-grams, namely "compound terms". Experimental results on three test collections showed an improvement.
Keywords
document handling; indexing; information retrieval systems; natural language processing; bigram models; compound terms; document semantic content; language modeling approach; n-gram models; single terms indexing; Compounds; Computational modeling; Data models; Hidden Markov models; Indexing; Information retrieval; Smoothing methods; compound term indexing; information retrieval; language model
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
Conference_Location
Lyon
Print_ISBN
978-1-4577-1373-6
Electronic_ISBN
978-0-7695-4513-4
Type
conf
DOI
10.1109/WI-IAT.2011.52
Filename
6040498