DocumentCode :
2216361
Title :
Recognizing broken characters in Thai historical documents
Author :
Sumetphong, Chaivatna ; Tangwongsan, Supachai
Author_Institution :
Fac. of Inf. & Commun. Technol., Mahidol Univ., Bangkok, Thailand
Volume :
1
fYear :
2010
fDate :
20-22 Aug. 2010
Abstract :
One of the biggest challenges in restoring historical documents is achieving a high level of OCR accuracy. A defining characteristic of these valuable but degraded documents is the abundance of broken characters. This paper formulates the problem as a mathematical model and proposes a novel solution based on set-partitions to recognize broken characters in Thai historical documents. Experiments based on this solution have been performed, and the results are very promising.
Keywords :
character recognition; document image processing; image restoration; mathematical analysis; natural language processing; OCR accuracy; Thai historical document; broken character recognition; degraded document; historical document restoration; mathematical model; set-partition; Character recognition; Broken Characters; Error Correction; Optical Character Recognition; Set-Partitions; Thai Historical Documents;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advanced Computer Theory and Engineering (ICACTE), 2010 3rd International Conference on
Conference_Location :
Chengdu
ISSN :
2154-7491
Print_ISBN :
978-1-4244-6539-2
Type :
conf
DOI :
10.1109/ICACTE.2010.5579053
Filename :
5579053