DocumentCode :
2850629
Title :
Text classification by boosting weak learners based on terms and concepts
Author :
Bloehdorn, Stephan ; Hotho, Andreas
Author_Institution :
Inst. AIFB, Karlsruhe Univ., Germany
fYear :
2004
fDate :
1-4 Nov. 2004
Firstpage :
331
Lastpage :
334
Abstract :
Document representations for text classification are typically based on the classical bag-of-words paradigm. This approach comes with deficiencies that motivate the integration of features on a higher semantic level than single words. In this paper, we propose an enhancement of the classical document representation through concepts extracted from background knowledge. Boosting is used for the actual classification. Experimental evaluations on two well-known text corpora support our approach through consistent improvement of the results.
Keywords :
classification; ontologies (artificial intelligence); text analysis; bag-of-words paradigm; document representations; text classification; weak learner boosting; Boosting; Data engineering; Data mining; Frequency; Information retrieval; Knowledge management; Learning systems; Ontologies; Text categorization; Tree data structures;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2004. ICDM '04. Fourth IEEE International Conference on
Print_ISBN :
0-7695-2142-8
Type :
conf
DOI :
10.1109/ICDM.2004.10077
Filename :
1410303