Title : 
A model guided document image analysis scheme
         
        
            Author : 
Harit, Gaurav ; Chaudhury, Santanu ; Gupta, P. ; Vohra, N. ; Joshi, S.D.
         
        
            Author_Institution : 
Dept. of Electr. Eng., Indian Inst. of Technol., New Delhi, India
         
        
        
            fDate : 
2001
         
        
        
        
            Abstract : 
This paper presents a new model-based document image segmentation scheme that uses XML-DTDs (eXtensible Markup Language Document Type Definitions). Given a document image, the algorithm can select the appropriate model. A new wavelet-based tool has been designed for distinguishing text from non-text regions and for characterizing font sizes. Our model-based analysis scheme uses this tool to identify the logical components of a document image.
         
        
            Keywords : 
document image processing; hypermedia markup languages; image segmentation; wavelet transforms; XML document type definition; document layout analysis; font size characterization; logical components identification; model-based document image segmentation scheme; model-guided document image analysis scheme; nontext regions; text regions; wavelet-based tool; Geometry; Graphics; Image analysis; Image segmentation; Layout; Shape; Text analysis;
         
        
        
        
            Conference_Title : 
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
         
        
            Conference_Location : 
Seattle, WA
         
        
            Print_ISBN : 
0-7695-1263-1
         
        
        
            DOI : 
10.1109/ICDAR.2001.953963