Title :
Recognizing Vietnamese Online Handwritten Separated Characters
Author :
Duy Khuong Nguyen ; The Duy Bui
Author_Institution :
Coll. of Technol., Vietnam Nat. Univ., Hanoi
Abstract :
The Vietnamese alphabet is based on the Latin alphabet with the addition of nine accent marks or diacritics: four of them create additional sounds, and the other five indicate the tone of each word. Because Vietnamese is a tonal language that uses tone to distinguish words, recognizing diacritics is an important part of recognizing Vietnamese words. However, in written form, diacritics are much smaller than the characters, which makes them very hard to recognize. Previous work on Vietnamese character recognition often pre-processes the input with a graph-based approach, trying to separate the main characters from their diacritics by determining connected regions at the pixel level. This approach, however, works well only when the input contains only characters with separable diacritics, for example, scanned images of printed documents. In this paper, we propose a robust method to recognize online Vietnamese characters with diacritics. Using the cosine transformation with appropriate sampling algorithms, we represent the multiple strokes of a character together in a single set of features. This set of features is then used as the input to a well-designed machine-learning-based system. We have tested our system on the combination of Vietnamese characters with diacritics and Section 1c (isolated characters) of the Unipen data set, and have obtained very competitive results.
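The feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sampling algorithm, the number of coefficients, or how strokes are combined, so everything below is an assumption made for illustration: the strokes are joined in writing order into one trajectory, resampled to a fixed number of points equally spaced along the arc length, and the x and y sequences are each passed through an unnormalized DCT-II, keeping only the first few coefficients. The names `resample`, `dct2`, and `character_features`, and the defaults `n_points=32` and `n_coeffs=8`, are hypothetical and not taken from the paper.

```python
import math


def resample(points, n):
    """Resample a polyline to n >= 2 points equally spaced along its arc length."""
    # Cumulative arc length at each original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        # Degenerate input (a single dot): repeat the point.
        return [points[0]] * n
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def dct2(signal):
    """Unnormalized DCT-II of a real sequence."""
    n = len(signal)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(signal))
        for k in range(n)
    ]


def character_features(strokes, n_points=32, n_coeffs=8):
    """Map all strokes of one character to a single fixed-length feature vector.

    Assumption: strokes are joined in writing order, so pen-up moves are
    treated as straight segments of the trajectory.
    """
    trajectory = [pt for stroke in strokes for pt in stroke]
    pts = resample(trajectory, n_points)
    xs = dct2([p[0] for p in pts])[:n_coeffs]
    ys = dct2([p[1] for p in pts])[:n_coeffs]
    return xs + ys


# Usage: a base letter stroke plus a small diacritic stroke above it.
base = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
mark = [(0.8, 1.5), (1.2, 1.8)]
features = character_features([base, mark])
```

Because the output length depends only on `n_coeffs`, characters with one stroke and characters with several strokes produce vectors of the same size, which is what allows a single classifier to consume them. Joining strokes through pen-up segments is only one possible design; the paper may combine strokes differently.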
Keywords :
discrete cosine transforms; graph theory; handwritten character recognition; natural language processing; word processing; Latin alphabet; Vietnamese alphabet; accent marks; character recognition; cosine transformation; diacritics; online handwritten separated characters; Character recognition; Educational institutions; Handwriting recognition; Information technology; Keyboards; Learning systems; Machine learning algorithms; Natural languages; Pixel; Robustness; Vietnamese online handwriting recognition;
Conference_Title :
Advanced Language Processing and Web Information Technology, 2008. ALPIT '08. International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-0-7695-3273-8
DOI :
10.1109/ALPIT.2008.58