• Title of article

    Filtering Spam E-Mail from Mixed Arabic and English Messages: A Comparison of Machine Learning Techniques

  • Author/Authors

    Alaa El-Halees (author)

  • Issue Information
    Journal with serial number, year 2009
  • Pages
    8
  • From page
    52
  • To page
    59
  • Abstract
    Spam is one of the main problems in email communication. Although the volume of non-English spam is increasing, little work has been done in this area. For example, users in the Arab world receive spam written mostly in Arabic, English, or mixed Arabic and English. Many researchers have used machine learning techniques to filter spam email messages, and this research applied several such techniques to filter messages of this kind. The study compared six supervised machine learning classifiers: maximum entropy, decision trees, artificial neural nets, naive Bayes, support vector machines, and k-nearest neighbor. The experiments suggested that words in Arabic messages should be stemmed before a classifier is applied. In addition, the experiments showed that, in most cases, classifiers using feature selection techniques achieve comparable or better performance than filters that do not use them.
  • Keywords
    Anti-spam filtering, Machine learning techniques, Text data mining
  • Journal title
    The International Arab Journal of Information Technology (IAJIT)
  • Serial Year
    2009
  • Record number

    668755