• DocumentCode
    1653838
  • Title

    A New Language Model Combining Single and Compound Terms

  • Author

    Hammache, Arezki ; Ahmed-Ouamer, Rachid ; Boughanem, Mohand

  • Author_Institution
    Lab. LARI, Univ. Mouloud Mammeri, Tizi-Ouzou, Algeria
  • Volume
    1
  • fYear
    2011
  • Firstpage
    67
  • Lastpage
    70
  • Abstract
    Most traditional information retrieval systems are based on single-term indexing. However, it is generally accepted that the semantic content of a document (or a query) cannot be accurately captured by a simple set of independent keywords. Although several works have incorporated phrases or other syntactic information in IR, such attempts have shown only slight benefit at best. In language modeling approaches in particular, this is achieved through the use of bigram or n-gram models. However, in these models all bigrams/n-grams are considered and weighted uniformly. In this paper, we introduce a new approach that considers and weights only certain types of n-grams, namely "compound terms". Experimental results on three test collections showed an improvement.
  • Keywords
    document handling; indexing; information retrieval systems; natural language processing; bigram models; compound terms; document semantic content; information retrieval systems; language modeling approach; n-gram models; single terms indexing; Compounds; Computational modeling; Data models; Hidden Markov models; Indexing; Information retrieval; Smoothing methods; compound term indexing; information retrieval; language model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4577-1373-6
  • Electronic_ISBN
    978-0-7695-4513-4
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2011.52
  • Filename
    6040498