Title :
Spam Filtering Issue: FPD Research between False Positive and False Negative
Author :
Liu Zhen ; Zhou Ming-Tian
Author_Institution :
Univ. of Electron. Sci. & Technol. of China, Chengdu
Abstract :
Because a false positive is more serious than a false negative in spam filtering, a novel email filter with the feature of partial dependency (FPD) is urgently needed. This paper comprehensively investigates the FPD between false positives and false negatives and proposes an advanced fitted logistic regression model for spam discrimination, introducing a coefficient function that incorporates the feature of partial dependency. Evaluation tests on real email test sets examine four aspects: the precision ratio, the dimensionality selection feature, the KL divergence distribution between RFP and RFN, and noise tolerance. On all four, the new model is shown to exhibit evident FPD.
Keywords :
regression analysis; security of data; unsolicited e-mail; KL divergence distribution; coefficient function; dimensionality selection feature; electronic mail filter; false negative; false positive; logistic regression model; partial dependency; precision ratio; spam filtering; Computer science; Electronic mail; Filtering; Filters; Logistics; Machine learning; Statistics; Testing; Text categorization; Unsolicited electronic mail;
Conference_Title :
Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
Conference_Location :
Haikou
Print_ISBN :
978-0-7695-2874-8
DOI :
10.1109/FSKD.2007.523