• DocumentCode
    3099320
  • Title
    Theses cluster based on bilingual and synonymous keyword sets using mutual information
  • Author
    Huang, Chung-yi ; Chen, Rung-Ching
  • Author_Institution
    Dept. of Inf. Manage., Chaoyang Univ. of Technol., Wufong, Taiwan
  • Volume
    5
  • fYear
    2009
  • fDate
    12-15 July 2009
  • Firstpage
    2999
  • Lastpage
    3004
  • Abstract
    Searching published papers is a necessary part of the research process. Because articles are written in various languages, precise queries are hard to achieve. In this paper, we propose an automatic thesis clustering method based on bilingual and synonymous keyword sets that include Chinese and English keywords. We also provide a clustering computation scheme to speed up operation. The system first automatically generates the bilingual and synonymous keyword sets and then clusters the theses based on those sets. The method not only addresses the weaknesses of using digital dictionaries for clustering, but also limits the errors that arise when querying by bilingual and synonymous keywords. The system was implemented with clustering computation technology to address the performance problems of traditional document clustering systems. By using many computer processes, the system not only saves considerable time but also attains high availability and effective load balancing. Preliminary experiments show that the system clusters theses effectively.
  • Keywords
    data mining; dictionaries; text analysis; word processing; automatic theses clustering method; bilingual keyword; digital dictionary; error problem; mutual information; synonymous keyword sets; Classification tree analysis; Cybernetics; Databases; Dictionaries; Frequency; Machine learning; Mutual information; Natural languages; Wireless LAN; Wireless networks; Bilingual and synonymous keyword; Document clustering; Keyword set;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2009 International Conference on
  • Conference_Location
    Baoding
  • Print_ISBN
    978-1-4244-3702-3
  • Electronic_ISBN
    978-1-4244-3703-0
  • Type
    conf
  • DOI
    10.1109/ICMLC.2009.5212598
  • Filename
    5212598