DocumentCode :
3340194
Title :
Categorization of On-Line Handwritten Documents
Author :
Saldarriaga, Sebastián Peña ; Morin, Emmanuel ; Viard-Gaudin, Christian
Author_Institution :
Univ. de Nantes, Nantes
fYear :
2008
fDate :
16-19 Sept. 2008
Firstpage :
95
Lastpage :
102
Abstract :
With the growth of on-line handwriting technologies, facilities for managing handwritten documents, such as retrieval of documents by topic, are required. These documents can contain graphics, equations or text, for instance. This work reports experiments on the categorization of on-line handwritten documents based on their textual content. We assume that handwritten text blocks have been extracted from the documents and, as a first step of the proposed system, we process them with an existing handwriting recognition engine. We analyse the effect of the word recognition rate on categorization performance, and we compare it with the performance obtained with the same texts available as ground truth. Two categorization algorithms (kNN and SVM) are compared in this work. The handwritten texts are a subset of the Reuters-21578 corpus collected from more than 1500 writers. Results show that there is no significant loss in categorization performance when the word error rate stays below 22%.
Keywords :
handwritten character recognition; text analysis; Reuters-21578 corpus; handwritten recognition engine; online handwriting technologies; online handwritten document categorization; Engines; Graphics; Handwriting recognition; Optical character recognition software; Personal digital assistants; Technology management; Text categorization; Text recognition; Writing; Noisy Text; On-line Documents
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis Systems, 2008. DAS '08. The Eighth IAPR International Workshop on
Conference_Location :
Nara
Print_ISBN :
978-0-7695-3337-7
Type :
conf
DOI :
10.1109/DAS.2008.45
Filename :
4669950