DocumentCode
2658890
Title
The effects of domain knowledge relations on domain text classification
Author
Lu, Han ; Zhengtao, Yu ; Jinhui, Deng ; Cheng, Zhang ; Cunli, Mao ; Jianyi, Guo
Author_Institution
Sch. of Inf. Eng. & Autom., Kunming Univ. of Sci. & Technol., Kunming
fYear
2008
fDate
16-18 July 2008
Firstpage
460
Lastpage
463
Abstract
Text classification usually relies on statistical methods to select features. When it is applied in different domains, the special internal knowledge relationships of those domains are not considered. This paper proposes a new text classification model based on domain knowledge relations. The model adopts the support vector machine learning algorithm, combines statistical samples and domain terminology to build the classification feature space, and calculates the similarity between domain concepts so that each classification feature is assigned a weight, thereby achieving domain text classification. The model was used in a text classification experiment on the YunNan travel domain and the non-travel domain. The results show that domain knowledge has a strong effect on domain text classification, and that classification accuracy improved by 4 percentage points compared with the improved TFIDF method.
Keywords
pattern classification; statistical analysis; support vector machines; text analysis; YunNan travel domain; domain knowledge relations; domain text classification; statistic samples; statistical method; support vector machine study algorithm; Application software; Automation; Computer applications; Electronic mail; Information processing; Laboratories; Statistical analysis; Support vector machine classification; Support vector machines; Text categorization; Domain Knowledge Relations; Domain Text Classification; Feature Selection; Weight Calculation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Control Conference, 2008. CCC 2008. 27th Chinese
Conference_Location
Kunming
Print_ISBN
978-7-900719-70-6
Electronic_ISBN
978-7-900719-70-6
Type
conf
DOI
10.1109/CHICC.2008.4605079
Filename
4605079