DocumentCode
2337360
Title
A comparison of collation algorithm for Myanmar language
Author
Yuzana ; Tun, Khin Marlar
Author_Institution
Univ. of Comput. Studies, Yangon
fYear
2008
fDate
13-16 Nov. 2008
Firstpage
538
Lastpage
543
Abstract
Written Myanmar does not use white space to mark word boundaries. Unicode database applications offer little support for Myanmar operations such as collation and searching. The need for a robust collation strategy has motivated wide-ranging research in natural language processing. We therefore propose MyCollate2, a new collation algorithm for the Myanmar language that extends MyCollate1. The algorithm is based on a heuristics chart (table). It first slices names into syllables and then collates them according to the order of the traditional standard Myanmar dictionary. The proposed heuristics chart works well not only for syllable segmentation but also for the collation of words. The algorithm can collate Myanmar names as well as Myanmar words with complex syllable structures, such as Pali words, Pali loan styles, subscript styles, and kinzi styles. It was tested on Myanmar names, Pali words from Damma books, and words from a dictionary. The experimental results show that syllable slicing reaches 99.55% accuracy in comparison with other methods. Collation accuracy reaches 95.88%, which is higher than that of the previous collation algorithm, MyCollate1.
Keywords
dictionaries; natural language processing; Damma books; MyCollate1; MyCollate2; Myanmar language dictionary; Myanmar name; Pali loan styles; Pali words; collation algorithm; collation strategy; dictionary words; heuristics chart; heuristics table; kinzi styles; natural language processing; subscript styles; syllable segmentation; syllable slicing; unicode database application; Books; Clustering algorithms; Databases; Dictionaries; Information retrieval; Libraries; Natural language processing; Natural languages; Sorting; White spaces;
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Conference_Location
London
Print_ISBN
978-1-4244-2916-5
Electronic_ISBN
978-1-4244-2917-2
Type
conf
DOI
10.1109/ICDIM.2008.4746740
Filename
4746740