• DocumentCode
    3242554
  • Title
    Automatic Discrimination between Printed and Handwritten Text in Documents
  • Author
    da Silva, Leonardo F. ; Conci, Aura ; Sanchez, Angel
  • Author_Institution
    Inst. de Comput., Univ. Fed. Fluminense - UFF, Niteroi, Brazil
  • fYear
    2009
  • fDate
    11-15 Oct. 2009
  • Firstpage
    261
  • Lastpage
    267
  • Abstract
    Recognition techniques for printed and handwritten text in scanned documents differ significantly. In this paper we address the problem of identifying each type of text. The process involves at least four steps: digitization, preprocessing, feature extraction, and decision (classification). A new aspect of our approach is the use of data mining techniques in the decision step. A new set of features extracted from each word is also proposed. Classification rules are mined and used to discern printed text from handwritten text. The proposed system was tested on two public image databases. All the efficiency measures computed were above 80% in every case.
  • Keywords
    data mining; document image processing; feature extraction; handwritten character recognition; image classification; image segmentation; optical character recognition; text analysis; classification rule mining; data mining; document automatic text discrimination; feature extraction; handwritten text; image classification; printed text; public image databases; scanned documents; text recognition; Character recognition; Classification tree analysis; Computer graphics; Data mining; Feature extraction; Hidden Markov models; Image databases; Image processing; Image segmentation; Optical character recognition software; Data Mining; Machine Vision; document analysis; optical characters recognition; text identification;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Graphics and Image Processing (SIBGRAPI), 2009 XXII Brazilian Symposium on
  • Conference_Location
    Rio de Janeiro
  • ISSN
    1550-1834
  • Print_ISBN
    978-1-4244-4978-1
  • Electronic_ISBN
    1550-1834
  • Type
    conf
  • DOI
    10.1109/SIBGRAPI.2009.40
  • Filename
    5395199