• DocumentCode
    571332
  • Title
    A New Method of Computing Chinese Word Similarity Based on Statistics
  • Author
    Zhang, Bo ; Hong, Lei ; Song, Shubin ; He, Liang ; Li, Guorong
  • Author_Institution
    Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai, China
  • fYear
    2012
  • fDate
    18-21 Aug. 2012
  • Firstpage
    43
  • Lastpage
    46
  • Abstract
    Word semantic similarity is a subjective notion, which makes it difficult to compute similarity values that agree with human judgment. Research on Chinese word semantic similarity is relatively scarce owing to its inherent complexity. This paper presents WFC-WS, a statistical approach to computing Chinese word semantic similarity that introduces word frequency contrast. Word semantic vectors are first obtained from co-occurrence and then extended with HIT-IR Tongyici Cilin (Extended). Word frequency contrast is then used to filter the semantic vectors. Experiments show that the results of WFC-WS are closer to the manually constructed standard than those of some similar methods.
  • Keywords
    information filtering; natural language processing; statistical analysis; word processing; Chinese word semantic similarity; HIT-IR; WFC; WS; co-occurrence; statistical method; word frequency contrast; word semantic vector filtering; Dictionaries; Humans; Semantics; Standards; Vectors; Semantic similarity; Tongyici Cilin
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Business Intelligence and Financial Engineering (BIFE), 2012 Fifth International Conference on
  • Conference_Location
    Lanzhou
  • Print_ISBN
    978-1-4673-2092-4
  • Type
    conf
  • DOI
    10.1109/BIFE.2012.17
  • Filename
    6305076
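
The pipeline summarized in the Abstract (co-occurrence vectors, extension with a synonym resource, filtering by word frequency contrast, then comparison) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the record does not define the contrast measure, so the foreground/background frequency ratio, the `min_ratio` threshold, the toy synonym groups standing in for Tongyici Cilin, and the use of cosine similarity are all assumptions, as are the function names.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vector(target, sentences, window=2):
    """Count words appearing within `window` tokens of `target`."""
    vec = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                vec.update(tokens[j] for j in range(lo, hi) if j != i)
    return vec


def extend_with_synonyms(vec, synonym_groups):
    """Copy each context word's count to the other members of its synonym group.

    `synonym_groups` is a hypothetical stand-in for a thesaurus such as
    Tongyici Cilin: an iterable of sets of interchangeable words.
    """
    extended = Counter(vec)
    for word, count in vec.items():
        for group in synonym_groups:
            if word in group:
                for syn in group:
                    if syn != word:
                        extended[syn] += count
    return extended


def filter_by_contrast(vec, fg_freq, bg_freq, min_ratio=1.5):
    """Keep context words whose foreground/background count ratio is high.

    Assumption: "word frequency contrast" is modelled here as a plain ratio
    of counts in two corpora; the paper's actual measure may differ.
    """
    return Counter({
        w: c for w, c in vec.items()
        if fg_freq.get(w, 0) / max(bg_freq.get(w, 0), 1) >= min_ratio
    })


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

In use, each of the two words being compared would get its own vector from a pre-segmented corpus, pass through the extension and filtering steps, and the two filtered vectors would then be handed to `cosine`.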