• DocumentCode
    477745
  • Title
    A Hybrid Statistical Language Model Applied to the Domain Specific Information Retrieval
  • Author
    Wang, Wei; Lin, Kunhui; Zhou, Changle
  • Author_Institution
    Software Sch., Xiamen Univ., Xiamen
  • Volume
    2
  • fYear
    2008
  • fDate
    18-20 Oct. 2008
  • Firstpage
    3
  • Lastpage
    7
  • Abstract
    Traditional language models take multi-topic document corpora as their research target. To avoid the interference introduced by the multi-topic problem, this paper focuses on domain-specific information retrieval (IR). In domain-specific IR, different terms are considered to contribute to different degrees to the final query result, so the terms in a document can be divided into categories according to their degree of contribution. The statistical information of a term, mainly its probabilities, is then computed with different methods and smoothing strategies according to its category. This paper proposes an improved hybrid statistical language model for domain-specific IR. In the experimental results, the new model shows a performance improvement of about 9% to 10%. Finally, some challenges and research directions for statistical language modeling are presented.
  • Keywords
    information retrieval; probability; query languages; statistical analysis; domain specific information retrieval; multitopics document corpus; query language; smooth strategy; statistical information; statistical language model; Computer science; Fuzzy systems; Handwriting recognition; Information retrieval; Interference; Natural languages; Probability distribution; Space technology; Speech recognition; Statistics
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
  • Conference_Location
    Shandong
  • Print_ISBN
    978-0-7695-3305-6
  • Type
    conf
  • DOI
    10.1109/FSKD.2008.240
  • Filename
    4666069