DocumentCode :
2792326
Title :
Study on the construction of domain text classification model with the help of domain knowledge
Author :
Yu, Zheng-tao ; Han, Lu ; Mao, Cun-li ; Guo, Jian-yi ; Meng, Xiang-yan ; Zhang, Zhi-kun
Author_Institution :
Sch. of Inf. Eng. & Autom., Kunming Univ. of Sci. & Technol., Kunming
Volume :
5
fYear :
2008
fDate :
12-15 July 2008
Firstpage :
2612
Lastpage :
2617
Abstract :
Traditional text classification models use statistical methods to obtain features. However, when discriminating between domain and non-domain text categories, these methods do not take domain knowledge relations into account. This paper presents a domain text classification model. The model uses the support vector machine learning algorithm, obtains domain classification feature words by combining statistically selected words with domain words, and constructs a domain classification feature space from them. With the help of domain knowledge relations, it computes the relevance between domain concepts and derives the domain classification feature weights, after which domain text classification is carried out. An experiment in the Yunnan tourism domain confirms that domain knowledge relations have a positive influence on domain text classification: the classification accuracy is 0.04 higher than that of the improved TFIDF method.
Keywords :
pattern classification; statistical analysis; support vector machines; text analysis; Yunnan tourism domain; domain knowledge; domain text classification model; statistical methods; structured domain classification feature space; support vector machine learning algorithm; union domain words; Application software; Computer applications; Cybernetics; Frequency; Information processing; Knowledge engineering; Laboratories; Machine learning; Statistics; Text categorization; Domain knowledge Relations; Feature selection; Text classification;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
Type :
conf
DOI :
10.1109/ICMLC.2008.4620849
Filename :
4620849