Title :
Comparison of stamp classification using SVM and random ferns
Author :
Petej, Pjero ; Gotovac, Sven
Author_Institution :
Fac. of Electr. Eng., Mech. Eng. & Naval Archit., Univ. of Split, Split, Croatia
Abstract :
Distributed software systems and processes that handle large numbers of documents have an essential need for data mining and document classification algorithms. These algorithms aim to optimize the process and make it less error-prone. In this paper we address the problem of document classification using two machine learning algorithms, SVM and random ferns. Both algorithms use stamp images in documents to classify the document itself. The idea is to classify the document stamp and then, using known information about the stamp owner, search the rest of the document for relevant data. Our results are based on actual documents used in the process of debt collection, and our training and test datasets are randomly picked from an existing database of over three million documents. Both classification algorithms are implemented and compared in terms of classification accuracy, robustness, and speed.
Keywords :
document image processing; image classification; learning (artificial intelligence); SVM; debt collection; document stamp classification; machine learning classification algorithms; random ferns; stamp images; Graphics processing units; Image recognition; Robustness; Supervised learning; Support vector machines; Testing; Training; classification; distributed; documents; ferns; random; software; stamps; system
Conference_Title :
Computers and Communications (ISCC), 2013 IEEE Symposium on
Conference_Location :
Split
DOI :
10.1109/ISCC.2013.6755055