Title :
Geometric method for document understanding and classification using online machine learning
Author :
Nattee, Cholwich ; Numao, Masayuki
Author_Institution :
Dept. of Comput. Sci., Tokyo Inst. of Technol., Japan
fDate :
2001
Abstract :
We propose a geometric method for document image processing. This research focuses on document understanding and classification by applying the Winnow algorithm, an online machine learning method. This application makes document image processing more flexible across various kinds of documents, since meaningful knowledge can be extracted from training examples and the model for a document type can be updated whenever a new example arrives. This research aims to analyze and classify scientific papers. We conduct experiments on documents from the proceedings of various conferences to show the performance of the proposed method. The experimental results are compared with the WISDOM++ system and also show the advantages of using the online machine learning method.
Keywords :
document image processing; image retrieval; learning systems; pattern classification; real-time systems; Winnow algorithm; document understanding; geometric method; machine learning; scientific papers; Application software; Computer science; Electronic mail; Information analysis; Logic; Machine learning algorithms; Text analysis;
Conference_Title :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
DOI :
10.1109/ICDAR.2001.953860