Title :
Learning to identify core term of knowledge unit from short text
Author :
Tian, Zhenhua ; Wang, Zhen ; Liu, Ziqi ; Xiang, Hengheng ; Liu, Jun ; Zheng, Qinghua
Author_Institution :
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., Xi'an, China
Abstract :
We present a new task: identifying the core term (CT) of a knowledge unit (KU) from text, in support of knowledge management and service. We investigate two kinds of approaches: binary classification using naïve Bayes, decision trees, logistic regression, and SVM; and competition learning based on pairwise classification. Both are combined with a rich feature set that ranges from position and token features to statistical and linguistic features. Experimental results show that simple binary classification addresses the task effectively, reaching 82.7% KU accuracy. However, because recognizing the core term depends on the KU as a whole and on all of its inner terms, competition learning based on pairwise classification achieves a better result of 89.6%. We also show empirically that every type of feature presented is useful for the task, and that the combination of position and linguistic features is essential for information extraction from short text.
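The competition idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (is_noun, is_last_token), the hand-set weights, and the helper names score, prefers, and pick_core_term are all hypothetical, and a fixed linear scorer stands in for the learned pairwise classifier. The candidate terms of one KU are compared pair by pair, and the term that wins the most comparisons is taken as the core term.

```python
from itertools import combinations

# Hypothetical hand-set weights. The paper trains a pairwise classifier;
# this fixed linear scorer is only a stand-in for illustration.
WEIGHTS = {"is_noun": 2.0, "is_last_token": 1.0}


def score(features):
    """Weighted sum of a candidate term's (assumed) feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def prefers(a, b):
    """Stand-in pairwise classifier: True if candidate a beats candidate b."""
    return score(a["features"]) > score(b["features"])


def pick_core_term(candidates):
    """Round-robin competition: compare every pair, return the term with most wins."""
    wins = {c["term"]: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        winner = a if prefers(a, b) else b
        wins[winner["term"]] += 1
    return max(wins, key=wins.get)


# Toy knowledge unit "binary search tree" with made-up feature values.
ku = [
    {"term": "binary", "features": {"is_noun": 0, "is_last_token": 0}},
    {"term": "search", "features": {"is_noun": 1, "is_last_token": 0}},
    {"term": "tree", "features": {"is_noun": 1, "is_last_token": 1}},
]
print(pick_core_term(ku))
```

Because each decision compares two terms from the same KU, the choice depends on the unit as a whole, unlike a binary classifier that labels each term in isolation.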
Keywords :
Bayes methods; decision trees; knowledge management; pattern classification; regression analysis; support vector machines; text analysis; unsupervised learning; KU; SVM; binary classification; competition learning; core term identification; core term recognition; decision tree; information extraction; knowledge management and service; knowledge unit; linguistic features; logistic regression; naïve Bayesian; pairwise classification; position features; short text; statistic features; token feature; Data mining; Feature extraction; Measurement; Pragmatics; Support vector machines; Training; Vectors; Classification; Core Term Identification; Knowledge Discovery; Knowledge Management and Service; Knowledge Unit; Text Mining;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
DOI :
10.1109/FSKD.2012.6233797