DocumentCode
183254
Title
Novel Handwritten Words and Documents Databases of Five Middle Eastern Languages
Author
Nobile, Nicola ; Khayyat, Muna ; Lam, Linh ; Suen, Ching
Author_Institution
Centre for Pattern Recognition & Machine Intell., Concordia Univ., Montreal, QC, Canada
fYear
2014
fDate
1-4 Sept. 2014
Firstpage
152
Lastpage
157
Abstract
This paper introduces new handwritten databases of selected words in the five Middle-Eastern languages of Arabic, Dari, Farsi, Pashto and Urdu. The databases share a common lexicon of forty words that are related to finance and are used in daily life. The five databases have been collected from over 1600 native writers located in four countries. Recognition results for each of the databases are also presented. Results come from three classifiers (Support Vector Machines, Modified Quadratic Discriminant Function, and Multi-layer Perceptron) which were implemented for recognition of the words based on gradient features. Given the diversity of the data, the results demonstrate the effectiveness of the implemented process in learning and recognizing samples of handwritten words from different languages. In addition, full-page handwritten documents of each language are presented, with approximately forty pages per language. Each document has associated ground truth information.
Keywords
database management systems; handwriting recognition; natural language processing; Arabic; Dari; Farsi; Middle Eastern languages; Pashto; Urdu; documents databases; gradient features; novel handwritten words databases; Character recognition; Databases; Feature extraction; Handwriting recognition; Support vector machines; Testing; Training; database; handwritten documents; isolated words; line extraction; recognition; word spotting
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on
Conference_Location
Heraklion
ISSN
2167-6445
Print_ISBN
978-1-4799-4335-7
Type
conf
DOI
10.1109/ICFHR.2014.33
Filename
6981012