Title :
The Research of Text Categorization Based on FP-Tree
Author_Institution :
Sch. of Inf. Manage., Shandong Economic University, Ji'nan, China
Abstract :
A decision tree is a tree-shaped forecasting model. Each path from the root node to a leaf node forms a rule that assigns a class label to an object. However, decision trees are usually applied only when the number of attributes is not too large. The FP-tree is a fast and efficient structure for discovering frequent patterns. This paper proposes a new, fast method for text categorization based on the FP-tree. The method uses an FP-tree to discover the frequent feature terms of documents and categories, and then builds a class decision tree that uses the top frequent feature terms as its test attributes. In this way, the decision tree can be applied to text categorization through the frequent feature terms. Finally, the paper presents experiments and an analysis of the method.
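The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpus, the names `FPNode`, `build_fp_tree`, `top_terms`, and `classify`, and the parameters `min_support` and `k` are all assumptions made for illustration. The abstract does not specify how the class decision tree is built, so the final step uses a simple term-overlap vote as a stand-in for it.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: a term, its count, and its child nodes."""

    def __init__(self, term=None):
        self.term = term
        self.count = 0
        self.children = {}


def build_fp_tree(docs, min_support):
    """Build an FP-tree from documents given as lists of terms.

    Returns (root, support), where support maps each frequent term to the
    number of documents that contain it.
    """
    # Pass 1: document frequency of each term; keep only frequent terms.
    df = Counter(term for doc in docs for term in set(doc))
    support = {t: c for t, c in df.items() if c >= min_support}

    # Pass 2: insert each document's frequent terms, most frequent first,
    # so that documents sharing common terms share a prefix path.
    root = FPNode()
    for doc in docs:
        terms = sorted((t for t in set(doc) if t in support),
                       key=lambda t: (-support[t], t))
        node = root
        for t in terms:
            node = node.children.setdefault(t, FPNode(t))
            node.count += 1
    return root, support


def top_terms(support, k):
    """Return the k terms with the highest support (ties broken by name)."""
    return sorted(support, key=lambda t: (-support[t], t))[:k]


def classify(doc, class_terms):
    """Stand-in for the class decision tree (assumption): pick the class
    whose top frequent terms overlap the document the most."""
    words = set(doc)
    return max(sorted(class_terms),
               key=lambda c: len(words & set(class_terms[c])))


# Hypothetical toy corpus: two categories, three tokenized documents each.
train = {
    "sports": [["match", "team", "goal"], ["team", "goal", "coach"],
               ["team", "match"]],
    "finance": [["stock", "market", "bank"], ["market", "bank", "rate"],
                ["market", "stock"]],
}

trees, class_terms = {}, {}
for label, docs in train.items():
    trees[label], support = build_fp_tree(docs, min_support=2)
    class_terms[label] = top_terms(support, k=2)

print(class_terms)
print(classify(["bank", "market", "news"], class_terms))
```

Keeping only the top `k` frequent terms per category is what limits the number of test attributes, which is the constraint on decision trees that the abstract points out.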
Keywords :
data mining; decision trees; forecasting theory; text analysis; FP-tree; class decision tree; class label; forecasting model; frequent feature terms; frequent patterns; text categorization; Association rules; Economic forecasting; Management information systems; Mutual information; Pattern recognition; Predictive models; Testing
Conference_Title :
Web Information Systems and Mining, 2009. WISM 2009. International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3817-4
DOI :
10.1109/WISM.2009.43