DocumentCode
2398505
Title
Retrieval of degraded Chinese document based on fuzzy coding strategy
Author
Xia Yong ; Jia Xu-Hui ; Wang Kuan-Quan
Author_Institution
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
fYear
2012
fDate
19-20 May 2012
Firstpage
261
Lastpage
264
Abstract
Because the recognition rate for degraded Chinese documents is low, retrieval performance is poor when it relies directly on OCR results. This paper presents a new way to improve retrieval performance using a fuzzy coding strategy. Many character classes with similar shapes are clustered and indexed by pseudo code. To ease testing, the paper also presents a way to generate ground truth for imaged documents and to synthesize degraded document images. A real OCR text collection and two synthesized document image collections are used for performance evaluation, and the results confirm the validity of the method.
Keywords
document image processing; fuzzy set theory; image coding; image retrieval; optical character recognition; OCR text collection; degraded Chinese document retrieval; fuzzy coding strategy; ground-truth generate; imaged document; pseudo code; synthesized degraded document image; Degradation; Encoding; Image retrieval; Indexing; Optical character recognition software; Performance evaluation; Text analysis; Retrieval of degraded Chinese document; Synthesis of degraded document; fuzzy coding strategy;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location
Yantai
Print_ISBN
978-1-4673-0198-5
Type
conf
DOI
10.1109/ICSAI.2012.6223602
Filename
6223602