DocumentCode :
568777
Title :
Bayesian spam filtering for Vietnamese emails
Author :
Lung, Vu Duc ; Vu, Truong Nguyen
Author_Institution :
Univ. of Inf. Technol.-Vietnam Nat. Univ., Ho Chi Minh City, Vietnam
Volume :
1
fYear :
2012
fDate :
12-14 June 2012
Firstpage :
190
Lastpage :
193
Abstract :
Spam filtering is a significant concern for researchers, and a number of techniques and email filtering systems have been implemented. They are, however, not very effective for the Vietnamese language. Although methods for filtering English spam email can still be applied to Vietnamese, the language has its own particular characteristics. The biggest difference is that a meaningful word in Vietnamese is usually a compound. When a compound used in spam email is split into single words, the resulting words are ones commonly used in both spam and ham emails, which makes it difficult for the system to filter spam. The objective of this paper is to present a new model that applies Naïve Bayesian algorithms to analyze the segmentation of Vietnamese text. The demonstration and evaluation of this model show the feasibility of this technique for filtering Vietnamese email.
Keywords :
Bayes methods; information filtering; natural languages; unsolicited e-mail; Bayesian spam filtering; English spam email; Vietnamese email; Vietnamese language; email filtering system; naive Bayesian algorithm; Accuracy; Databases; Dictionaries; Economics; Unsolicited electronic mail;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer & Information Science (ICCIS), 2012 International Conference on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4673-1937-9
Type :
conf
DOI :
10.1109/ICCISci.2012.6297237
Filename :
6297237