Title :
A clustering-based approach to the separation of text strings from mixed text/graphics documents
Author :
He, Shoujie ; Abe, Norihiro
Author_Institution :
Dept. of Inf. Syst. & Comput. Sci., Nat. Univ. of Singapore, Singapore
Abstract :
A clustering-based approach to the separation of text from mixed text/graphics documents is presented. The approach starts from the grouping of connected components. Clustering is employed at three critical stages to improve the efficiency and effectiveness of the grouping: prior to the grouping, prior to orientation estimation, and after orientation estimation. Because of the high accuracy of the estimated orientation, not only overgrouping but also most undergrouping cases could be handled successfully.
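The abstract does not give the algorithmic details, so the following is only a minimal sketch of two of the ingredients it names, not the authors' method. It assumes connected components have already been extracted and are described by their heights and centroid coordinates. The function names `cluster_by_size` and `estimate_orientation`, the `ratio` threshold, and the use of a principal-axis fit for orientation are all illustrative assumptions.

```python
import math


def cluster_by_size(heights, ratio=1.5):
    """Cluster component indices by similar height (simple 1-D gap clustering).

    Assumption for illustration: a new cluster starts whenever the next
    height, in sorted order, exceeds the previous one by more than `ratio` times.
    """
    order = sorted(range(len(heights)), key=lambda i: heights[i])
    clusters = []
    for i in order:
        if clusters and heights[i] <= ratio * heights[clusters[-1][-1]]:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters


def estimate_orientation(centroids):
    """Return the principal-axis angle of the centroids, in degrees in (-90, 90].

    This is a second-moment (total least squares) line fit, swapped in here
    as a stand-in for whatever estimator the paper actually uses.
    """
    n = len(centroids)
    mx = sum(x for x, _ in centroids) / n
    my = sum(y for _, y in centroids) / n
    sxx = sum((x - mx) ** 2 for x, _ in centroids)
    syy = sum((y - my) ** 2 for _, y in centroids)
    sxy = sum((x - mx) * (y - my) for x, y in centroids)
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))


# Hypothetical data: three character-sized components and two larger ones.
size_clusters = cluster_by_size([10, 11, 12, 40, 42])

# Hypothetical centroids of one candidate text string lying on a diagonal.
angle = estimate_orientation([(0, 0), (1, 1), (2, 2), (3, 3)])
```

Clustering by size before grouping keeps large graphic components from being merged with character-sized ones, and an orientation estimate per group gives a direction along which a string could be extended or split. The angle is measured in the coordinate system of the input points, so with image coordinates (y pointing down) its sign is flipped relative to the usual mathematical convention.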
Keywords :
document image processing; image recognition; clustering-based approach; mixed text/graphics documents; orientation estimation; text string separation; Computer graphics; Computer science; Data mining; Helium; Information systems; Maximum likelihood estimation; Smoothing methods; Systems engineering and theory; Testing; Tree data structures
Conference_Title :
Proceedings of the 13th International Conference on Pattern Recognition, 1996
Conference_Location :
Vienna
Print_ISBN :
0-8186-7282-X
DOI :
10.1109/ICPR.1996.547037