• DocumentCode
    568777
  • Title
    Bayesian spam filtering for Vietnamese emails
  • Author
    Lung, Vu Duc ; Vu, Truong Nguyen
  • Author_Institution
    Univ. of Inf. Technol.-Vietnam Nat. Univ., Ho Chi Minh City, Vietnam
  • Volume
    1
  • fYear
    2012
  • fDate
    12-14 June 2012
  • Firstpage
    190
  • Lastpage
    193
  • Abstract
    Spam filtering is a considerable concern for researchers, and a number of techniques and email filtering systems have been implemented. They are, however, not very effective for the Vietnamese language. Although methods for filtering English spam email can still be used for Vietnamese, the language has its own particular characteristics. The biggest difference is that a meaningful word in Vietnamese is usually a compound. When a compound used in spam email is separated into single words, the resulting words are ones that commonly appear in both spam and ham emails, which makes it difficult for the system to filter spam. The objective of this paper is to present a new model that applies Naïve Bayesian algorithms to analyze the segmentation of Vietnamese text. The demonstration and evaluation of this model show the feasibility of the technique for filtering Vietnamese email.
  • Keywords
    Bayes methods; information filtering; natural languages; unsolicited e-mail; Bayesian spam filtering; English spam email; Vietnamese email; Vietnamese language; email filtering system; naive Bayesian algorithm; Accuracy; Databases; Dictionaries; Economics; Unsolicited electronic mail;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer & Information Science (ICCIS), 2012 International Conference on
  • Conference_Location
    Kuala Lumpur
  • Print_ISBN
    978-1-4673-1937-9
  • Type
    conf
  • DOI
    10.1109/ICCISci.2012.6297237
  • Filename
    6297237