• DocumentCode
    538437
  • Title
    Email classification using data reduction method
  • Author
    Islam, Rafiqul; Xiang, Yang
  • Author_Institution
    Sch. of Inf. Technol., Deakin Univ., Burwood, VIC, Australia
  • fYear
    2010
  • fDate
    25-27 Aug. 2010
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Correctly classifying user emails in the presence of spam is an important research issue for anti-spam researchers. This paper presents an effective and efficient email classification technique based on a data filtering method. We introduce a filtering technique that uses the instance selection method (ISM) to remove uninformative data instances from the training model before classifying the test data. The objective of ISM is to identify which instances (examples, patterns) in an email corpus should be selected as representatives of the entire dataset without significant loss of information. We used the WEKA interface in our integrated classification model and tested diverse classification algorithms. Our empirical studies show strong classification accuracy together with a reduction in false positives.
  • Keywords
    data reduction; information filtering; pattern classification; unsolicited e-mail; WEKA interface; data filtering method; data reduction method; instance selection method; spam; user emails classification; Accuracy; Classification algorithms; Electronic mail; Feature extraction; Filtering; Support vector machines; Training;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications and Networking in China (CHINACOM), 2010 5th International ICST Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-963-9799-97-4
  • Type
    conf
  • Filename
    5684656