Title :
A comparison of morpheme and word based document retrieval for Asian languages
Author :
Nguyen, Van Be Hai ; Vines, Phil ; Wilkinson, Ross
Author_Institution :
Dept. of Comput. Sci., R. Melbourne Inst. of Technol., Vic., Australia
Abstract :
Most document retrieval systems are word-based. Words are very convenient retrieval units in English, but less so in some Asian languages. The task of determining which morphemes constitute words in Vietnamese and Chinese is problematic, and this has been assumed to be the reason that word-based retrieval does not work as well for these languages. The paper examines a number of segmentation algorithms and then reports on experiments comparing morpheme-based and word-based retrieval. It shows that morpheme-based retrieval is hard to improve on.
Keywords :
information retrieval system evaluation; natural languages; query processing; Asian languages; Chinese language; Vietnamese language; morpheme based document retrieval; segmentation algorithms; word based document retrieval; Computer science; Data mining; Frequency; Indexing; Information retrieval; Natural languages; Robustness
Conference_Title :
Database and Expert Systems Applications, 1996. Proceedings., Seventh International Workshop on
Conference_Location :
Zurich
Print_ISBN :
0-8186-7662-0
DOI :
10.1109/DEXA.1996.558329