DocumentCode :
2525464
Title :
Script Identification of Document Image Analysis
Author :
Cheng, Uan ; Ping, Xijian ; Zhou, Guanwei ; Yang, Yang
Author_Institution :
Zhengzhou Inf. Sci. & Technol. Inst.
Volume :
3
fYear :
2006
fDate :
Aug. 30 2006-Sept. 1 2006
Firstpage :
178
Lastpage :
181
Abstract :
Script identification prior to OCR is necessary in document image analysis. Each script has a unique spatial distribution and visual attributes that make it possible to distinguish it from other scripts. The key to a script identification algorithm is extracting effective measurable features. By analyzing visual differences based on normalized histogram statistics, Chinese, Japanese, English and Russian are each identified from the others. Automatic identification of the four scripts is thereby realized successfully.
Keywords :
document image processing; optical character recognition; OCR; document image analysis; normalized histogram statistic; script identification algorithm; vision difference analysis; Histograms; Image analysis; Image segmentation; Information science; Natural languages; Optical character recognition software; Shape; Statistical analysis; Statistical distributions; Text analysis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Innovative Computing, Information and Control, 2006. ICICIC '06. First International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7695-2616-0
Type :
conf
DOI :
10.1109/ICICIC.2006.518
Filename :
1692145