DocumentCode
568777
Title
Bayesian spam filtering for Vietnamese emails
Author
Lung, Vu Duc ; Vu, Truong Nguyen
Author_Institution
Univ. of Inf. Technol.-Vietnam Nat. Univ., Ho Chi Minh City, Vietnam
Volume
1
fYear
2012
fDate
12-14 June 2012
Firstpage
190
Lastpage
193
Abstract
Spam filtering is a significant concern for researchers, and a number of techniques and email filtering systems have been implemented. They are, however, not very effective for the Vietnamese language. Although methods for filtering English spam email can still be applied to Vietnamese, the language has its own particular characteristics. The biggest difference is that a meaningful word in Vietnamese is usually a compound. When a compound used in spam email is split into single words, the resulting words are ones commonly used in both spam and ham emails, which makes it difficult for the system to filter spam. The objective of this paper is to present a new model that applies Naïve Bayesian algorithms to analyze the segmentation of the Vietnamese language. The demonstration and evaluation of this model show the feasibility of this technique for filtering Vietnamese email.
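The compound-word problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the compound dictionary, the greedy longest-match segmenter, the function names, and the four training sentences are all hypothetical and chosen only for illustration. It shows a multinomial Naïve Bayes classifier with add-one smoothing in which known compounds are kept as single tokens, so that "khuyến mãi" counts as one spam-indicative feature instead of two syllables that also occur in ham.

```python
import math
from collections import Counter

# Hypothetical toy compound dictionary (an assumption, not from the paper).
COMPOUNDS = {("khuyến", "mãi"), ("miễn", "phí")}
MAX_COMPOUND_LEN = 2

def segment(text, compounds=COMPOUNDS, max_len=MAX_COMPOUND_LEN):
    """Greedy longest-match: join syllables that form a known compound."""
    syllables = text.lower().split()
    tokens, i = [], 0
    while i < len(syllables):
        for n in range(max_len, 0, -1):
            chunk = tuple(syllables[i:i + n])
            # Fall back to a single syllable when no compound matches.
            if n == 1 or chunk in compounds:
                tokens.append("_".join(chunk))
                i += len(chunk)
                break
    return tokens

def train(labeled_docs):
    """Count tokens per class and documents per class."""
    counts = {"spam": Counter(), "ham": Counter()}
    doc_totals = Counter()
    for text, label in labeled_docs:
        counts[label].update(segment(text))
        doc_totals[label] += 1
    return counts, doc_totals

def classify(text, counts, doc_totals):
    """Return the class with the highest log posterior (add-one smoothing)."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    n_docs = sum(doc_totals.values())
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        score = math.log(doc_totals[label] / n_docs)
        for tok in segment(text):
            score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy training data, for illustration only.
TRAIN = [
    ("khuyến mãi miễn phí", "spam"),
    ("khuyến mãi lớn", "spam"),
    ("mãi mãi nhớ bạn", "ham"),
    ("khuyến khích bạn học", "ham"),
]

counts, doc_totals = train(TRAIN)
```

With this toy data, `segment("khuyến mãi lớn")` yields the tokens `khuyến_mãi` and `lớn`, and the syllables "khuyến" and "mãi" on their own appear only in the ham counts. A real system would need a full Vietnamese word segmenter and a labeled corpus in place of the toy dictionary and sentences.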
Keywords
Bayes methods; information filtering; natural languages; unsolicited e-mail; Bayesian spam filtering; English spam email; Vietnamese email; Vietnamese language; email filtering system; naive Bayesian algorithm; Accuracy; Databases; Dictionaries; Economics; Unsolicited electronic mail;
fLanguage
English
Publisher
ieee
Conference_Titel
Computer & Information Science (ICCIS), 2012 International Conference on
Conference_Location
Kuala Lumpur
Print_ISBN
978-1-4673-1937-9
Type
conf
DOI
10.1109/ICCISci.2012.6297237
Filename
6297237