DocumentCode :
2719889
Title :
Short text understanding through lexical-semantic analysis
Author :
Wen Hua ; Zhongyuan Wang ; Haixun Wang ; Kai Zheng ; Xiaofang Zhou
Author_Institution :
Sch. of Inf., Renmin Univ. of China, Beijing, China
fYear :
2015
fDate :
13-17 April 2015
Firstpage :
495
Lastpage :
506
Abstract :
Understanding short texts is crucial to many applications, but challenges abound. First, short texts do not always observe the syntax of a written language. As a result, traditional natural language processing methods cannot be easily applied. Second, short texts usually do not contain sufficient statistical signals to support many state-of-the-art approaches for text processing such as topic modeling. Third, short texts are usually more ambiguous. We argue that knowledge is needed in order to better understand short texts. In this work, we use lexical-semantic knowledge provided by a well-known semantic network for short text understanding. Our knowledge-intensive approach disrupts traditional methods for tasks such as text segmentation, part-of-speech tagging, and concept labeling, in the sense that we focus on semantics in all these tasks. We conduct a comprehensive performance evaluation on real-life data. The results show that knowledge is indispensable for short text understanding, and our knowledge-intensive approaches are effective in harvesting semantics of short texts.
Keywords :
natural language processing; statistical analysis; text analysis; concept labeling; knowledge-intensive approach; lexical-semantic analysis; natural language processing method; part-of-speech tagging; short text harvesting semantics; short text understanding; statistical signals; text processing; text segmentation; topic modeling; Approximation algorithms; Companies; Context; Labeling; Semantics; Tagging; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering (ICDE), 2015 IEEE 31st International Conference on
Conference_Location :
Seoul
Type :
conf
DOI :
10.1109/ICDE.2015.7113309
Filename :
7113309