DocumentCode :
2028344
Title :
A study of features on Primary Question detection in Chinese online forums
Author :
Sun, Lin ; Liu, Bingquan ; Wang, Baoxun ; Zhang, Deyuan ; Wang, Xiaolong
Author_Institution :
MOE-MS Key Lab. of Natural Language Process. & Speech, Harbin Inst. of Technol., Harbin, China
Volume :
5
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
2422
Lastpage :
2427
Abstract :
Primary question detection in online forums is a subtask of extracting question-answer pairs. In this paper, after surveying the forms that questions take in Chinese online forums, a combination of textual and N-gram features obtained via feature selection is adopted to help detect primary questions. Primary question detection is treated as a binary classification problem, and a C4.5 decision tree classifier and a support vector machine are each applied separately to distinguish questions from non-questions. Experimental results across multiple datasets demonstrate that the mixture of textual and N-gram features performs better than either feature type alone under both C4.5 and the support vector machine. When the weight of each feature is computed in the two classifiers, the top six features are found to be the same apart from minor differences in order, showing that the combination of textual and N-gram features is universal and effective in detecting primary questions.
Keywords :
classification; decision trees; information retrieval; natural language processing; support vector machines; Chinese online forums; N-gram features; binary classification; decision tree classifier; feature selection; primary question detection; question-answer pairs; support vector machine; textual features; Classification tree analysis; Electronic mail; Feature extraction; Speech; N-gram feature; information extraction; textual feature
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569298
Filename :
5569298