DocumentCode :
3278167
Title :
Thai word segmentation for visualization of Thai Web sites
Author :
Thanadechteemapat, Wigrai ; Fung, Chun-che
Author_Institution :
Sch. of Inf. Technol., Murdoch Univ., Murdoch, WA, Australia
Volume :
4
fYear :
2011
fDate :
10-13 July 2011
Firstpage :
1544
Lastpage :
1549
Abstract :
Information overload is a problem of the Information Age, and information visualization is one approach to providing an overview of the content of a Web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on longest matching together with a refined corpus. The segmentation results are compatible with the results from previous BEST contests in Thailand.
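For orientation only, the following is a minimal sketch of generic greedy longest matching, the family of technique the abstract names. It is not the authors' implementation: the function name `longest_match_segment`, the four-word toy dictionary, and the fallback of emitting a single character when nothing matches are all assumptions made for illustration, and the paper's refined corpus is not reproduced here.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching against a word list (illustrative)."""
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
        else:
            # Assumed fallback: no word starts here, so emit one character as unknown.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy dictionary; a real system would load a large corpus instead.
toy_dictionary = {"ตา", "กลม", "ตาก", "ลม"}
print(longest_match_segment("ตากลม", toy_dictionary))  # ['ตาก', 'ลม']
```

The sample string can be read either as ตา + กลม or as ตาก + ลม. Greedy longest matching always commits to the longer first word, which is one reason the quality of the word list matters for this family of methods.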
Keywords :
Web sites; data visualisation; feature extraction; image segmentation; natural language processing; word processing; Thai Web site visualization; Thai word segmentation; Thailand; information overload; information visualization; matching technique; tag cloud generation; tag extraction; word extraction; Compounds; Data visualization; Internet; Tag clouds; Visualization; Web pages; Tag cloud; Thai Word Segmentation; Web Page Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
Conference_Location :
Guilin
ISSN :
2160-133X
Print_ISBN :
978-1-4577-0305-8
Type :
conf
DOI :
10.1109/ICMLC.2011.6016978
Filename :
6016978