DocumentCode :
2195850
Title :
Classification of oriental and European scripts by using characteristic features
Author :
Ding, Jie ; Lam, Louisa ; Suen, Ching Y.
Author_Institution :
Centre for Pattern Recognition & Machine Intelligence, Concordia Univ., Montreal, Que., Canada
Volume :
2
fYear :
1997
fDate :
18-20 Aug 1997
Firstpage :
1023
Abstract :
Two types of techniques are usually adopted in language differentiation: token matching and statistical analysis. In this paper we present a method that uses a combined analysis of several discriminating statistical features to differentiate between European and oriental language scripts. When applied to more than 23 languages, it has proved effective in classifying documents printed in these different scripts.
Keywords :
character sets; document image processing; feature extraction; image classification; image matching; optical character recognition; statistical analysis; European script classification; OCR; characteristic features; document classification; language differentiation; oriental script classification; statistical analysis; statistical features; token matching; Character recognition; Filters; Image segmentation; Indexing; Machine intelligence; Natural languages; Pattern recognition; Performance analysis; Statistical analysis; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997
Conference_Location :
Ulm
Print_ISBN :
0-8186-7898-4
Type :
conf
DOI :
10.1109/ICDAR.1997.620664
Filename :
620664