• DocumentCode
    2873059
  • Title

    An efficient semantic VSM based email categorization method

  • Author

    Lu, Zhao ; Ding, Jianguo

  • Author_Institution
    Dept. of Comput. Sci. & Technol., East China Normal Univ., Shanghai, China
  • Volume
    11
  • fYear
    2010
  • fDate
    22-24 Oct. 2010
  • Abstract
    Email categorization is challenging due to its sparse and noisy feature space. To address this problem, a novel semantic Vector Space Model (sVSM) using WordNet is proposed in this paper. The basic idea of sVSM is to select related semantic features that will increase the global information, and use them to enrich the semantic feature of an email. The proposed categorization method based on sVSM creates the semantic feature of an email category by both extracting terms from the training emails and enriching these terms with their concept-chains in WordNet. Next, the tf*iwf*iwf algorithm is used to adjust the weights of the semantic feature vector. Experimental evaluations show that the proposed method categorizes emails better than other email categorization methods based on traditional VSM, Bayesian, and KNN approaches. Further experiments show that the proposed method yields better accuracy for smaller training sets by highlighting the semantic feature when identifying an email category.
  • Keywords
    electronic mail; pattern classification; email categorization method; global information; semantic vector space model; Accuracy; Electronic mail; Feature extraction; Modeling; Semantics; Support vector machine classification; Training; Email Categorization; Semantic Vector; Vector Space Model;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Application and System Modeling (ICCASM), 2010 International Conference on
  • Conference_Location
    Taiyuan
  • Print_ISBN
    978-1-4244-7235-2
  • Electronic_ISBN
    978-1-4244-7237-6
  • Type

    conf

  • DOI
    10.1109/ICCASM.2010.5623150
  • Filename
    5623150
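
The tf*iwf*iwf weighting named in the abstract can be illustrated with a minimal sketch. The record does not give the formula, so this assumes one common reading: iwf(t) = log(N / n_t), where N is the total number of token occurrences in the corpus and n_t is the number of occurrences of term t. The function name tf_iwf2_weights and the toy data are hypothetical, and the WordNet concept-chain enrichment step is not shown.

```python
import math
from collections import Counter


def tf_iwf2_weights(doc_tokens, corpus_tokens):
    """Weight each term of one document by tf * iwf * iwf.

    Assumption (not stated in the record): iwf(t) = log(N / n_t), with
    N the total token count of the corpus and n_t the count of term t.
    """
    corpus_counts = Counter(corpus_tokens)
    total = sum(corpus_counts.values())  # N: all token occurrences
    tf = Counter(doc_tokens)             # raw term frequency in the document

    weights = {}
    for term, freq in tf.items():
        n_t = corpus_counts.get(term, 0)
        if n_t == 0:
            continue  # term never seen in the corpus: iwf is undefined, skip it
        iwf = math.log(total / n_t)
        weights[term] = freq * iwf * iwf
    return weights


# Toy corpus and email, invented purely for illustration.
corpus = "meeting schedule meeting budget report budget meeting invoice".split()
email = "budget meeting budget".split()
weights = tf_iwf2_weights(email, corpus)
```

Squaring the iwf factor penalizes terms that are frequent across the corpus more strongly than a single inverse-frequency factor would, so in this toy data the rarer term "budget" outweighs "meeting".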