DocumentCode :
3001165
Title :
Comparative analysis of similarity measures for sentence level semantic measurement of text
Author :
Saad, Shaharil Mad ; Kamarudin, Siti Sakira
Author_Institution :
Product Quality & Reliability Eng., MIMOS Berhad, Kuala Lumpur, Malaysia
fYear :
2013
fDate :
Nov. 29 2013-Dec. 1 2013
Firstpage :
90
Lastpage :
94
Abstract :
The accuracy of similarity measurement between sentences is critical to the performance of applications such as text mining, question answering, and text summarization. This paper focuses on calculating semantic similarity between sentences and on a comparative analysis of the identified similarity measurement techniques. Three popular similarity measures, Jaccard, Cosine, and Dice, are compared, and the performance of each is evaluated and recorded. In this paper, we use WordNet, a large lexical database of English, to calculate word-to-word semantic similarity. The results indicate that the Jaccard and Dice measures perform better in measuring the semantic similarity between sentences.
Keywords :
database management systems; natural language processing; text analysis; Cosine similarity measure; Dice similarity measure; English lexical database; Jaccard similarity measure; WordNet; comparative analysis; sentence semantic similarity; similarity measurement technique; text sentence level semantic measurement; word-to-word semantic similarity; Benchmark testing; Conferences; Control systems; Information retrieval; Measurement techniques; Semantics; Vectors; Semantic Similarity; Sentence Similarity; Similarity Measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Control System, Computing and Engineering (ICCSCE), 2013 IEEE International Conference on
Conference_Location :
Mindeb
Print_ISBN :
978-1-4799-1506-4
Type :
conf
DOI :
10.1109/ICCSCE.2013.6719938
Filename :
6719938
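
Illustrative note (not part of the original record): the three measures named in the abstract can be sketched on plain token overlap. This is a minimal sketch under stated assumptions, not the authors' method: the paper builds on WordNet word-to-word similarity, whereas this example uses exact token matching only, and the function names `tokens`, `jaccard`, `dice`, and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tokens(sentence):
    # Assumption: lowercase whitespace tokenization; the paper's preprocessing is not stated here.
    return sentence.lower().split()


def jaccard(a, b):
    # |A ∩ B| / |A ∪ B| over the sets of distinct tokens.
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a, b):
    # 2 * |A ∩ B| / (|A| + |B|) over the sets of distinct tokens.
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


def cosine(a, b):
    # Cosine of the angle between the two term-frequency vectors.
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


s1 = "the cat sat on the mat"
s2 = "the cat lay on the rug"
print(jaccard(s1, s2), dice(s1, s2), cosine(s1, s2))
```

Because exact matching treats synonyms as unrelated words, a WordNet-based variant would replace the exact-match test with a word-to-word similarity score; that step is left out here.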