Title :
Spam Filtering based on Character Field
Author :
Liu Hui ; Zhang Cai-ming
Author_Institution :
Shandong Econ. Univ., Jinan
Abstract :
A novel method for spam filtering is proposed, based on the character fields of e-mail documents and the term frequencies within those fields. The techniques used in the method are discussed, including selecting the characters of e-mail documents and constructing the character term lexicons by computing term frequency (TF) weights. In addition, an improved probabilistic model for computing text similarity is provided. Experiments show that the new method outperforms the traditional Rocchio method in recall, precision, and other evaluation measures.
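As an illustration of the Rocchio-style baseline the abstract compares against, the following minimal sketch builds a TF-weighted term lexicon per e-mail field and classifies a message by summed cosine similarity to class prototypes. It is not the authors' improved probabilistic model, which the abstract does not specify. The field names (`subject`, `body`), the whitespace tokenizer, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

FIELDS = ("subject", "body")  # assumed field names, not taken from the paper


def term_frequencies(text):
    """Relative frequency of each lowercase, whitespace-separated token."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()} if total else {}


def centroid(vectors):
    """Average sparse TF vectors into one prototype (Rocchio-style centroid)."""
    summed = Counter()
    for vec in vectors:
        summed.update(vec)  # adds the weights term by term
    n = len(vectors)
    return {t: w / n for t, w in summed.items()}


def build_lexicons(messages, fields=FIELDS):
    """One TF-weighted term lexicon per field for a class of messages."""
    return {
        f: centroid([term_frequencies(m.get(f, "")) for m in messages])
        for f in fields
    }


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(message, spam_lex, ham_lex, fields=FIELDS):
    """Label a message by its summed per-field similarity to each class."""
    spam_score = ham_score = 0.0
    for f in fields:
        tf = term_frequencies(message.get(f, ""))
        spam_score += cosine(tf, spam_lex[f])
        ham_score += cosine(tf, ham_lex[f])
    return "spam" if spam_score > ham_score else "ham"


# Toy training data, invented for this example only.
spam_train = [{"subject": "win cash now", "body": "claim your free prize now"}]
ham_train = [{"subject": "meeting agenda", "body": "please review the project report"}]
spam_lex = build_lexicons(spam_train)
ham_lex = build_lexicons(ham_train)
label = classify({"subject": "free cash", "body": "claim prize"}, spam_lex, ham_lex)
```

Scoring each field separately keeps a term in the subject line from being diluted by a long body, which is one plausible reading of why field-level term frequencies would help.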
Keywords :
unsolicited e-mail; e-mail documents; spam filtering; term frequency; text similarity; Classification algorithms; Computer science; Electronic mail; Frequency; Information filtering; Information filters; Machine learning; Machine learning algorithms; Support vector machines; Unsolicited electronic mail;
Conference_Title :
Innovative Computing, Information and Control, 2007. ICICIC '07. Second International Conference on
Conference_Location :
Kumamoto
Print_ISBN :
0-7695-2882-1
DOI :
10.1109/ICICIC.2007.531