• DocumentCode
    3278167
  • Title
    Thai word segmentation for visualization of Thai Web sites
  • Author
    Thanadechteemapat, Wigrai; Fung, Chun-che
  • Author_Institution
    Sch. of Inf. Technol., Murdoch Univ., Murdoch, WA, Australia
  • Volume
    4
  • fYear
    2011
  • fDate
    10-13 July 2011
  • Firstpage
    1544
  • Lastpage
    1549
  • Abstract
    Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed Thai word segmentation technique is based on the longest matching technique together with a refined corpus. The results of Thai word segmentation are compatible with the results from previous BEST contests in Thailand.
  • Keywords
    Web sites; data visualisation; feature extraction; image segmentation; natural language processing; word processing; Thai Web site visualization; Thai word segmentation; Thailand; information overload; information visualization; matching technique; tag cloud generation; tag extraction; word extraction; Compounds; Data visualization; Internet; Tag clouds; Visualization; Web pages; Tag cloud; Thai Word Segmentation; Web Page Visualization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2011 International Conference on
  • Conference_Location
    Guilin
  • ISSN
    2160-133X
  • Print_ISBN
    978-1-4577-0305-8
  • Type
    conf
  • DOI
    10.1109/ICMLC.2011.6016978
  • Filename
    6016978
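
The abstract names the longest matching technique with a corpus as the basis of the segmenter. The sketch below shows a generic greedy longest-match segmenter for illustration only; it is not the authors' implementation. The function name `longest_match_segment`, the `max_len` parameter, and the toy lexicon are assumptions, and the paper's refined corpus is not reproduced here. Unknown text falls back to single code points, which is a simplification for Thai because it can separate a combining vowel or tone mark from its base consonant.

```python
def longest_match_segment(text, lexicon, max_len=None):
    """Greedy left-to-right longest matching against a word list.

    `lexicon` is a set of known words (a stand-in for a corpus-derived
    dictionary). At each position the longest lexicon entry that starts
    there is taken; if none matches, one code point is emitted as-is.
    """
    if max_len is None:
        # Longest dictionary entry bounds how far ahead we need to look.
        max_len = max((len(w) for w in lexicon), default=1)

    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one code point at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # Hypothetical fallback for out-of-vocabulary text.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


# Toy lexicon (assumed for illustration, not taken from the paper).
toy_lexicon = {"ไป", "โรง", "เรียน", "โรงเรียน"}
tokens = longest_match_segment("ไปโรงเรียน", toy_lexicon)
```

With this toy lexicon, the compound "โรงเรียน" is preferred over its parts "โรง" and "เรียน" because the longer entry is tried first. Greedy matching can still choose a wrong boundary when an early long match blocks a better later split, which is one reason the quality of the word list matters for this family of methods.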