DocumentCode
2376013
Title
A topic detection method based on Semantic Dependency Distance and PLSA
Author
Chen, Yan ; Yang, Yang ; Zhang, Huisan ; Zhu, Haiping ; Tian, Feng
Author_Institution
Dept. of Comput. Sci. & Technol., Xi'an Jiaotong Univ., Xi'an, China
fYear
2012
fDate
23-25 May 2012
Firstpage
703
Lastpage
708
Abstract
Topic detection is an active research area in text mining. Focusing on Chinese interactive text, this paper explores a novel topic detection method, SDD-PLSA, which integrates Semantic Dependency Distance (SDD) with probabilistic latent semantic analysis (PLSA). The method retains the advantages of PLSA, an efficient and effective technique widely used in text mining, while also taking semantic and syntactic information into account, and so avoids PLSA's lack of semantic information. SDD-PLSA has two main steps. First, SDD is used to group sentences with high semantic similarity, based on semantic features extracted from the interactive text. Second, a PLSA classifier is applied to the result of the first step. Experiments show that detection accuracy on the 'love' topic improves to 64.8% with SDD-PLSA, compared with 55.4% with PLSA alone.
Keywords
classification; computer aided instruction; data mining; feature extraction; statistical analysis; text analysis; Chinese interactive text; PLSA classifier; SDD-PLSA; e-learning systems; probabilistic latent semantic analysis; semantic dependency distance; semantic feature extraction; semantic information; sentence classification; syntax information; text mining; topic detection method; Pragmatics; Semantics; E-learning; Interactive Text; PLSA; Semantic Dependency Distance; Topic Detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Supported Cooperative Work in Design (CSCWD), 2012 IEEE 16th International Conference on
Conference_Location
Wuhan
Print_ISBN
978-1-4673-1211-0
Type
conf
DOI
10.1109/CSCWD.2012.6221895
Filename
6221895