Title : 
Underline removal method by utilizing characteristics of Japanese business documents
         
        
            Author : 
Oba, Mitsuharu ; Nozaki, Yasuyuki ; Matsumoto, Toshiko ; Onoyama, Takashi
         
        
            Author_Institution : 
R&D Dept., Hitachi Software Eng. Co., Ltd., Tokyo, Japan
         
        
        
        
        
        
            Abstract : 
In this paper, we propose an underline removal method specific to Japanese business documents. Automated removal of underlines is important because underlines are the main cause of OCR misrecognition. The main feature of our method is that it removes various types of underlines, such as touching, inclined, and blurred lines, by line template matching. Moreover, by excluding table ruled lines, which are necessary for document structure analysis, our method makes it possible to remove all possible underlines. The experimental results demonstrate that the proposed method improves OCR recognition accuracy.
         
        
            Keywords : 
document image processing; image matching; optical character recognition; Japanese business documents; OCR misrecognition; document structure analysis; line template matching; optical character recognition software; table ruled lines; underline removal method; Character recognition; Companies; Content management; Data mining; Electrochemical machining; Production; Software engineering; Technology management; Text analysis; OCR; business documents; underline removal
         
        
        
        
            Conference_Title : 
TENCON 2009 - 2009 IEEE Region 10 Conference
         
        
            Conference_Location : 
Singapore
         
        
            Print_ISBN : 
978-1-4244-4546-2
         
        
            Electronic_ISBN : 
978-1-4244-4547-9
         
        
        
            DOI : 
10.1109/TENCON.2009.5396199