Title :
Measuring Similarity between Sentence Fragments
Author :
Guangyuan Huang ; Jianqiang Sheng
Author_Institution :
State-Province Joint Lab. of Digital Home Interactive Applic., Sun Yat-sen Univ., Guangzhou, China
Abstract :
Sentence fragments have a wide range of applications, such as short text mining and flow diagram search based on label similarity. Existing methods are not entirely appropriate for measuring similarity between sentence fragments, since they were originally designed for complete sentences or long texts. We therefore pay more attention to proper nouns, which carry important information in sentence fragments, and propose a novel measuring method applicable to sentence fragments and even short sentences. It calculates similarity based on the edit distance model instead of the traditional vector space model. In addition, manual weight factors are introduced to meet the needs of different situations. Our experiments demonstrate that our method outperforms existing methods.
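The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea: a token-level edit distance in which proper nouns cost more to insert, delete, or substitute, controlled by a manually chosen weight. The function names, the `proper_weight` parameter, the capitalization heuristic for spotting proper nouns, and the normalization by total token weight are all assumptions made for illustration, not the authors' method.

```python
def token_weight(token, proper_weight=2.0):
    """Return the edit cost of a token.

    Assumption: a capitalized token is treated as a proper noun. A real
    system would more likely use a part-of-speech tagger.
    """
    return proper_weight if token[:1].isupper() else 1.0


def weighted_edit_distance(a, b, proper_weight=2.0):
    """Token-level edit distance between token lists a and b.

    Inserting or deleting a token costs its weight; substituting one
    token for a different one costs the larger of the two weights.
    """
    wa = [token_weight(t, proper_weight) for t in a]
    wb = [token_weight(t, proper_weight) for t in b]
    # dp[i][j] holds the cheapest cost of turning a[:i] into b[:j].
    dp = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        dp[i][0] = dp[i - 1][0] + wa[i - 1]
    for j in range(1, len(b) + 1):
        dp[0][j] = dp[0][j - 1] + wb[j - 1]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            same = a[i - 1].lower() == b[j - 1].lower()
            substitute = 0.0 if same else max(wa[i - 1], wb[j - 1])
            dp[i][j] = min(
                dp[i - 1][j] + wa[i - 1],         # delete a[i-1]
                dp[i][j - 1] + wb[j - 1],         # insert b[j-1]
                dp[i - 1][j - 1] + substitute,    # match or substitute
            )
    return dp[-1][-1]


def fragment_similarity(x, y, proper_weight=2.0):
    """Map the weighted distance to a score in [0, 1].

    Assumption: the distance is normalized by the total weight of both
    fragments, which is an upper bound on the distance.
    """
    a, b = x.split(), y.split()
    if not a and not b:
        return 1.0
    total = sum(token_weight(t, proper_weight) for t in a + b)
    return 1.0 - weighted_edit_distance(a, b, proper_weight) / total


if __name__ == "__main__":
    print(fragment_similarity("visit Guangzhou office", "visit Nanchang office"))
    print(fragment_similarity("visit Guangzhou office", "visit Guangzhou branch"))
```

Under these assumptions, a mismatch on a proper noun lowers the score more than a mismatch on an ordinary word, and raising `proper_weight` widens that gap. Because the distance is divided by the combined weight of both fragments, two fragments with no tokens in common do not necessarily score 0; a different normalization may suit some applications better.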
Keywords :
data mining; natural language processing; text analysis; edit distance model; flow diagram search; label similarity; manual weight factors; measuring method; sentence fragments; short text mining; vector space model; Accuracy; Artificial intelligence; Humans; Joints; Semantics; Syntactics; Vectors; Sentence fragment; degree of matching; edit distance; measuring similarity; proper nouns;
Conference_Title :
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2012 4th International Conference on
Conference_Location :
Nanchang, Jiangxi
Print_ISBN :
978-1-4673-1902-7
DOI :
10.1109/IHMSC.2012.88