• DocumentCode
    468162
  • Title
    Spam Filtering Issue: FPD Research between False Positive and False Negative
  • Author
    Liu Zhen ; Zhou Ming-Tian
  • Author_Institution
    Univ. of Electron. Sci. & Technol. of China, Chengdu
  • Volume
    1
  • fYear
    2007
  • fDate
    24-27 Aug. 2007
  • Firstpage
    526
  • Lastpage
    534
  • Abstract
    Because a false positive is more serious than a false negative in spam filtering, a novel email filter with the feature of partial dependency (FPD) is urgently needed. This paper comprehensively investigates the FPD between false positives and false negatives and proposes an advanced fitted logistic regression model for spam discrimination by introducing a coefficient function that incorporates the feature of partial dependency. Evaluation tests on real email test sets examine four aspects: the precision ratio, the dimensionality selection feature, the KL divergence distribution between RFP and RFN, and noise tolerance. In these tests the new model is shown to exhibit evident CPD.
  • Keywords
    regression analysis; security of data; unsolicited e-mail; KL divergence distribution; coefficient function; dimensionality selection feature; electronic mail filter; false negative; false positive; logistic regression model; partial dependency; precision ratio; spam filtering; Computer science; Electronic mail; Filtering; Filters; Logistics; Machine learning; Statistics; Testing; Text categorization; Unsolicited electronic mail;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on
  • Conference_Location
    Haikou
  • Print_ISBN
    978-0-7695-2874-8
  • Type
    conf
  • DOI
    10.1109/FSKD.2007.523
  • Filename
    4405981