DocumentCode :
506848
Title :
Similarity Computation of Low-frequency Chinese Words
Author :
Fan, Xinghua ; Chen, Ji
Author_Institution :
Coll. of Comput. Sci. & Technol., Univ. of Posts & Telecommun., Chongqing, China
Volume :
1
fYear :
2009
fDate :
14-16 Aug. 2009
Firstpage :
524
Lastpage :
528
Abstract :
This paper proposes a novel method for computing the similarity of low-frequency Chinese words. It adopts a combined strategy that draws on the dictionary Hownet and on a corpus constructed from Internet search results. The method has three steps: (1) If both words exist in Hownet, the similarity between them is computed from Hownet. (2) If either of the two words, a and b, does not exist in Hownet, word a, word b, and the word pair a and b are each used as a query to search the Internet, and a corpus is constructed from the search results; the similarity between the two words is then computed from their contexts in that corpus. (3) To ensure that similarities computed from the two sources are comparable, the similarity computed from the constructed corpus is multiplied by a coefficient. Experimental results show that the proposed method effectively solves the problem of computing low-frequency word similarity.
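The three steps in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the abstract does not say how context similarity is measured, so cosine similarity over co-occurrence counts is an assumption here, as are the window size, the coefficient value of 0.5, the way the pair query is formed, and every function name (`in_dictionary`, `dictionary_sim`, `search` are hypothetical callables supplied by the caller).

```python
from collections import Counter
from math import sqrt


def context_vector(word, corpus, window=2):
    """Count the tokens that appear within `window` positions of `word`.

    `corpus` is a list of tokenized sentences (lists of strings).
    """
    counts = Counter()
    for sentence in corpus:
        for i, token in enumerate(sentence):
            if token != word:
                continue
            start = max(0, i - window)
            stop = min(len(sentence), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    counts[sentence[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (assumed measure)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def combined_similarity(a, b, dictionary_sim, in_dictionary, search, coefficient=0.5):
    """Dictionary-based similarity when possible, scaled corpus-based otherwise.

    dictionary_sim(a, b) -> float, in_dictionary(w) -> bool, and
    search(query) -> list of tokenized sentences are hypothetical callables.
    """
    # Step 1: both words are in the dictionary, so use it directly.
    if in_dictionary(a) and in_dictionary(b):
        return dictionary_sim(a, b)
    # Step 2: build a corpus from three queries: a, b, and the pair.
    corpus = search(a) + search(b) + search(f"{a} {b}")
    context_sim = cosine(context_vector(a, corpus), context_vector(b, corpus))
    # Step 3: scale so that both sources yield comparable values.
    return coefficient * context_sim
```

With stub callables in place of a real dictionary and search engine, `combined_similarity(a, b, ...)` returns the dictionary score whenever both words are covered and a scaled context score otherwise.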
Keywords :
Internet; query formulation; Chinese low frequency word similarity computation; constructed corpus; dictionary Hownet; Computer science; Educational institutions; Frequency shift keying; Fuzzy systems; Paper technology; Search engines; Statistics; Taxonomy; Telecommunication computing; low frequency; word similarity;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2009. FSKD '09. Sixth International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-0-7695-3735-1
Type :
conf
DOI :
10.1109/FSKD.2009.476
Filename :
5358532