DocumentCode
2477457
Title
E-Mail Filtering Based on Analysis of Structural Features and Text Classification
Author
Li, Xiao ; Luo, Junyong ; Yin, Meijuan
Author_Institution
Inf. Sci. & Technol. Inst., Zhengzhou, China
fYear
2010
fDate
22-23 May 2010
Firstpage
1
Lastpage
4
Abstract
To meet the need for e-mail filtering that improves efficiency and accuracy in e-mail mining, topic detection, and other specific applications, an approach based on feature analysis and text classification is proposed, drawing on traditional spam filtering methods. Feature analysis filtering uses structural features that are very likely to identify an irrelevant e-mail, such as group sending and embedded pictures, and so offsets the high cost of text classification. The category of a group-sent mail is identified by the presence of personal names, and e-mail filtering based on a URL blacklist is improved. The Naive Bayesian e-mail classification algorithm is improved by accounting for the different contributions of the subject and the body text to the category. The experimental results show that the method is reasonable and effective.
Keywords
Bayes methods; data mining; e-mail filters; feature extraction; text analysis; URL; blacklist; e-mail filtering; e-mail mining; group-sent mail; naive Bayesian e-mail classification; personal names; spam filtering; structural features analysis; text classification; Bayesian methods; Data mining; Electronic mail; Information analysis; Information filtering; Information filters; Postal services; Support vector machines; Text categorization; Uniform resource locators
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems and Applications (ISA), 2010 2nd International Workshop on
Conference_Location
Wuhan
Print_ISBN
978-1-4244-5872-1
Electronic_ISBN
978-1-4244-5874-5
Type
conf
DOI
10.1109/IWISA.2010.5473242
Filename
5473242