DocumentCode
596344
Title
A clustering strategy for touching characters in Korean and English printed text segmentation
Author
Wahyono ; Kang-Hyun Jo
Author_Institution
Dept. of Electr. Eng., Univ. of Ulsan, Ulsan, South Korea
fYear
2012
fDate
26-28 Nov. 2012
Firstpage
23
Lastpage
25
Abstract
This paper proposes a segmentation method for mixed Korean and English printed text containing touching characters, based on a clustering strategy. In the first step, the vertical projection of the text image is computed and a clustering process is performed on it. The cluster with the smallest mean value is then used as the set of candidate segmentation points, which yields candidate bounding boxes. Each candidate box is then verified against Korean or English character characteristics; boxes that fail verification are split or merged with their neighbors. Merging is based on Korean vowel characteristics, since a Korean character consists of several symbols, while splitting is done by clustering a local vertical projection. The proposed method achieves a correct segmentation rate of 99.36% on non-touching characters and 99.25% on touching characters. These results show that the clustering strategy is very effective for the touching-character problem in mixed Korean and English printed text. It also speeds up segmentation, because the method does not need a character recognizer to verify bounding boxes.
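The first step described in the abstract (vertical projection, clustering, lowest-mean cluster as candidate cut points) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the number of clusters, so a plain 1-D k-means with k = 2 is assumed here, and all function names (`vertical_projection`, `kmeans_1d`, `candidate_cut_columns`, `group_runs`) are hypothetical. The verification, merge, and split stages are not shown.

```python
import numpy as np

def vertical_projection(binary):
    """Count ink pixels per column; `binary` is a 2-D array with 1 = ink."""
    return np.asarray(binary).sum(axis=0)

def kmeans_1d(values, k=2, iters=50):
    """Plain 1-D k-means (an assumed choice, not stated in the abstract)."""
    values = np.asarray(values, dtype=float)
    # Spread the initial centers evenly between the smallest and largest value.
    centers = np.linspace(values.min(), values.max(), k)
    labels = np.zeros(len(values), dtype=int)
    for _ in range(iters):
        # Assign every value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            if np.any(labels == j):
                new_centers[j] = values[labels == j].mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def candidate_cut_columns(binary, k=2):
    """Columns that fall in the cluster with the smallest mean projection."""
    proj = vertical_projection(binary)
    labels, centers = kmeans_1d(proj, k)
    low = int(np.argmin(centers))
    return np.flatnonzero(labels == low)

def group_runs(cols):
    """Collapse consecutive column indices into (start, end) runs."""
    runs = []
    for c in cols:
        c = int(c)
        if runs and c == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], c)
        else:
            runs.append((c, c))
    return runs

if __name__ == "__main__":
    # Toy image: two blobs joined by a one-pixel bridge, a gap, then a third blob.
    img = np.zeros((5, 9), dtype=int)
    img[:, 0:3] = 1
    img[2, 3] = 1      # thin bridge between touching characters
    img[:, 4:7] = 1
    img[:, 8] = 1
    print(group_runs(candidate_cut_columns(img)))
```

In this toy image both the thin bridge (column 3) and the blank gap (column 7) land in the low-mean cluster, which is how a touching pair can still receive a candidate cut without a blank column between the characters.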
Keywords
character recognition; image segmentation; natural language processing; pattern clustering; robot vision; text analysis; English printed text segmentation method; Korean alphabet; Korean printed text segmentation method; Korean vowel characteristics; candidate bounding boxes; candidate segmentation point; local vertical projection clustering strategy; segmentation process; segmentation rate; touching characters; vertical image text projection; Ambient intelligence; Character recognition; Clustering algorithms; Image segmentation; Robots; Text recognition; Writing; Character Recognition; Clustering; Segmentation; Touching Character;
fLanguage
English
Publisher
IEEE
Conference_Titel
Ubiquitous Robots and Ambient Intelligence (URAI), 2012 9th International Conference on
Conference_Location
Daejeon
Print_ISBN
978-1-4673-3111-1
Electronic_ISBN
978-1-4673-3110-4
Type
conf
DOI
10.1109/URAI.2012.6462921
Filename
6462921