Author_Institution :
Key Lab. of Symbolic Comput. & Knowledge Eng. attached to the Minist. of Educ., Jilin Univ., Changchun, China
Abstract :
Measuring semantic similarity between words is a classical problem in natural language processing, and progress on it benefits many applications, such as machine translation, word sense disambiguation, ontology mapping, and computational linguistics. This paper combines knowledge-based methods with statistical methods to measure word similarity; its novel aspect is the use of the subjective Bayes method. First, we extract evidence from WordNet. Second, we analyze the reasonableness of each candidate evidence using scatter plots. Third, we generate a sufficiency measure through statistics and piecewise linear interpolation. Fourth, we obtain a comprehensive posterior by integrating uncertainty reasoning with a conclusion uncertainty synthetic strategy. Finally, we quantify word semantic similarity. On the R&G (65) data set, we conducted experiments with 5-fold cross-validation. The correlation between our experimental results and human judgment is 0.912, a 0.4% improvement over the best existing approach, which shows that using the subjective Bayes method to measure word semantic similarity is reasonable and effective.
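The abstract does not give the evidence definitions, the interpolation knots, or the exact synthesis rule, so the following is only a minimal sketch of how a pipeline of this kind could work. It assumes the standard odds-likelihood form of subjective Bayes, in which each piece of evidence multiplies the prior odds by a sufficiency measure LS, and it assumes the evidence is treated as independent. The names `PATH_LS` and `IC_LS`, their knot values, and the prior of 0.5 are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Piecewise linear interpolation over sorted knots, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def posterior(prior, evidence_values, ls_tables):
    """Combine several evidence values into one posterior probability.

    Each evidence value is mapped to a sufficiency measure LS by
    interpolation, and the prior odds are multiplied by every LS
    (assumption: the pieces of evidence are independent).
    """
    o = odds(prior)
    for value, (xs, ys) in zip(evidence_values, ls_tables):
        o *= interpolate(value, xs, ys)
    return prob(o)


# Hypothetical LS tables: (evidence value knots, LS at each knot).
# LS > 1 supports "the two words are similar", LS < 1 opposes it.
PATH_LS = ([0.0, 0.5, 1.0], [0.2, 1.0, 5.0])
IC_LS = ([0.0, 0.5, 1.0], [0.5, 1.0, 3.0])

score = posterior(0.5, [0.75, 0.5], [PATH_LS, IC_LS])
print(score)  # 0.75
```

In this sketch, the first evidence value maps to LS = 3 and the second to LS = 1 (neutral), so prior odds of 1 become posterior odds of 3, that is, a probability of 0.75. In practice the knots would be estimated from training statistics, for example within each fold of the cross-validation.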
Keywords :
Bayes methods; inference mechanisms; interpolation; knowledge based systems; natural language processing; statistical analysis; uncertainty handling; WordNet; candidate evidence reasonableness analysis; conclusion uncertainty synthetic strategy; evidence extraction; knowledge-based methods; piecewise linear interpolation technique; scatter plot; statistical methods; subjective Bayes method; uncertainty reasoning; word semantic similarity measurement; Cognition; Correlation; Market research; Semantics; Uncertainty; Piecewise linear interpolation; Subjective Bayes; Word Semantic Similarity;