DocumentCode :
2864308
Title :
eMailSift: eMail classification based on structure and content
Author :
Aery, Manu ; Chakravarthy, Sharma
Author_Institution :
Dept. of IT Lab. & CSE, Texas Univ., Arlington, TX, USA
fYear :
2005
fDate :
27-30 Nov. 2005
Abstract :
In this paper we propose a novel approach that uses the structure as well as the content of emails in a folder for email classification. Our approach is based on the premise that representative (common and recurring) structures/patterns can be extracted from a pre-classified email folder and used effectively for classifying incoming emails. A number of factors that influence representative structure extraction and classification are analyzed conceptually and validated experimentally. In our approach, the notion of inexact graph match is leveraged for deriving structures that provide coverage for characterizing folder contents. Extensive experimentation validates the selection of parameters and the effectiveness of our approach for email classification.
Keywords :
electronic mail; pattern classification; eMail classification; eMail content; eMail structure; eMailSift; inexact graph match; structure extraction; Data mining; Degradation; Electronic mail; Internet; Laboratories; Text categorization; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, Fifth IEEE International Conference on
ISSN :
1550-4786
Print_ISBN :
0-7695-2278-5
Type :
conf
DOI :
10.1109/ICDM.2005.58
Filename :
1565657