DocumentCode
3278167
Title
Thai word segmentation for visualization of Thai Web sites
Author
Thanadechteemapat, Wigrai ; Fung, Chun-che
Author_Institution
Sch. of Inf. Technol., Murdoch Univ., Murdoch, WA, Australia
Volume
4
fYear
2011
fDate
10-13 July 2011
Firstpage
1544
Lastpage
1549
Abstract
Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on the longest matching technique together with a refined corpus. The results of Thai word segmentation are compatible with the results from the previous BEST contests in Thailand.
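The abstract names longest matching against a corpus as the basis of the segmenter. The record does not include the authors' algorithm or their refined corpus, so the following is only a minimal sketch of generic greedy longest matching, assuming a plain set of known words. The function name `longest_match_segment` and the tiny `toy_dictionary` are hypothetical and chosen purely for illustration.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching against a set of known words.

    Generic illustration only; not the method or corpus from the paper.
    """
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first and shrink until a word is found.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            # Unknown character: emit it alone so the scan always advances.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy word list; a real system would load a much larger corpus.
toy_dictionary = {"ภาษา", "ไทย", "ภาษาไทย", "คำ", "ตัด"}
print(longest_match_segment("ตัดคำภาษาไทย", toy_dictionary))
```

On this input the sketch prefers the compound ภาษาไทย over the shorter entry ภาษา, which is the defining behaviour of longest matching. The greedy choice can also mis-segment when a long match consumes characters that belong to the next word, which is one reason the quality of the word list matters for this family of techniques.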
Keywords
Web sites; data visualisation; feature extraction; image segmentation; natural language processing; word processing; Thai Web site visualization; Thai word segmentation; Thailand; information overload; information visualization; matching technique; tag cloud generation; tag extraction; word extraction; Compounds; Data visualization; Internet; Tag clouds; Visualization; Web pages; Tag cloud; Thai Word Segmentation; Web Page Visualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
Conference_Location
Guilin
ISSN
2160-133X
Print_ISBN
978-1-4577-0305-8
Type
conf
DOI
10.1109/ICMLC.2011.6016978
Filename
6016978