Title :
Semantic Pattern Tree Kernels for Short-Text Classification
Author :
Kim, Kwanho ; Chung, Beom-Suk ; Choi, Ye Rim ; Park, Jonghun
Author_Institution :
Inf. Manage. Lab., Seoul Nat. Univ., Seoul, South Korea
Abstract :
Kernel methods are widely used for document classification in diverse domains. Popular kernels such as bag-of-words kernels and tree kernels give satisfactory results when classifying documents such as articles, e-mails, or web pages. However, they perform less well on short-text documents, because short documents provide an insufficient feature space. To cope with this problem, this paper presents a novel kernel function, called the semantic pattern tree kernel, for classifying short-text documents. The proposed kernel extends the feature space of each document by incorporating syntactic and semantic information using three levels of semantic annotations. Experiments on the Open Directory Project dataset show that, in classifying short-text documents, semantic pattern tree kernels achieve higher accuracy than conventional kernels.
Keywords :
pattern classification; text analysis; trees (mathematics); document feature space; open directory project dataset; semantic annotations; semantic information; semantic pattern tree kernel; short text document classification; syntactic information; Accuracy; Educational institutions; Information management; Kernel; Semantics; Syntactics; Vectors; document classification; kernel methods; open directory project; semantic; short-text document; support vector machine
Conference_Title :
Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4673-0006-3
DOI :
10.1109/DASC.2011.202