Title :
A Hierarchical Classification Model for Document Categorization
Author :
Xu, Jian-Wu ; Singh, Vartika ; Govindaraju, Venu ; Neogi, Depankar
Author_Institution :
Copanion Inc., Andover, MA, USA
Abstract :
In this paper, we propose a novel hierarchical classification method for document categorization. The approach consists of multiple levels of classification for different hierarchies. Regularized Least Square (RLS) binary classifiers are applied in the middle levels of the hierarchy to classify documents into a smaller set of categories, and K-nearest-neighbor (KNN) multi-class classifiers are used at the bottom to classify documents into final classes. Experiments on large-scale real-world tax documents show that the proposed hierarchical approach outperforms the traditional flat classification method.
Keywords :
document image processing; image classification; least squares approximations; document categorization; hierarchical classification model; k-nearest-neighbor multiclass classifier; regularized least square binary classifier; Large-scale systems; Least squares methods; Classifier Combination; Document Classification; Hierarchical Classification; Multiple-classifiers;
Conference_Title :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
DOI :
10.1109/ICDAR.2009.187
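
The two-stage idea described in the abstract (RLS binary classifiers to narrow a document down to a smaller set of categories, then KNN to pick the final class) can be illustrated with a minimal sketch. This record does not give the paper's features, hierarchy depth, or training details, so everything below is an assumption made for illustration: a single middle level with one one-vs-rest RLS classifier per group, the closed-form ridge solution, Euclidean KNN, and the hypothetical names `fit_rls`, `knn_predict`, and `TwoLevelClassifier`. It is not the authors' implementation.

```python
import numpy as np
from collections import Counter


def fit_rls(X, y, lam=1.0):
    """Closed-form regularized least squares: w = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


class TwoLevelClassifier:
    """Hypothetical two-level hierarchy: RLS picks a group, KNN picks the class."""

    def __init__(self, lam=1.0, k=3):
        self.lam, self.k = lam, k

    def fit(self, X, groups, labels):
        X = np.asarray(X, dtype=float)
        # Append a constant feature so each RLS classifier has a bias term
        # (assumption: the bias is regularized along with the other weights).
        self.X = np.hstack([X, np.ones((len(X), 1))])
        self.groups = np.asarray(groups)
        self.labels = np.asarray(labels)
        self.group_ids = sorted(set(groups))
        # One binary RLS classifier per group: +1 for members, -1 for the rest.
        self.W = {
            g: fit_rls(self.X, np.where(self.groups == g, 1.0, -1.0), self.lam)
            for g in self.group_ids
        }
        return self

    def predict_one(self, x):
        x = np.append(np.asarray(x, dtype=float), 1.0)
        # Middle level: route to the group whose RLS score is highest.
        group = max(self.group_ids, key=lambda g: x @ self.W[g])
        # Bottom level: KNN restricted to training documents of that group.
        mask = self.groups == group
        return knn_predict(self.X[mask], self.labels[mask], x, self.k)


# Toy data (invented): the first feature separates groups A and B,
# the second separates the two final classes inside each group.
X_toy = [[-2, -1], [-2, 1], [-3, -1], [-3, 1], [2, -1], [2, 1], [3, -1], [3, 1]]
groups_toy = ["A", "A", "A", "A", "B", "B", "B", "B"]
labels_toy = ["a1", "a2", "a1", "a2", "b1", "b2", "b1", "b2"]

model = TwoLevelClassifier(lam=1.0, k=3).fit(X_toy, groups_toy, labels_toy)
print(model.predict_one([2.2, 0.8]))
print(model.predict_one([-2.8, -0.9]))
```

Restricting the KNN search to the group chosen by the RLS stage is what distinguishes this from a flat classifier, which would compare the query against every training document in every class.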