DocumentCode :
2312786
Title :
A Survey on Text Classification Techniques for E-mail Filtering
Author :
Upasana ; Chakravarty, S.
Author_Institution :
Div. of Comput. Eng., Netaji Subhas Inst. of Technol., New Delhi, India
fYear :
2010
fDate :
9-11 Feb. 2010
Firstpage :
32
Lastpage :
36
Abstract :
The continuing explosive growth of textual content within the World Wide Web has given rise to the need for sophisticated Text Classification (TC) techniques that combine efficiency with high quality of results. E-mail filtering is one application that has the potential to affect every user of the Internet. Even though a large body of research has delved into this problem, there is a paucity of surveys that indicate trends and directions. This paper attempts to categorize the prevalent popular techniques for classifying email as spam or legitimate and suggests possible techniques to fill in the lacunae. Our findings suggest that context-based email filtering has the most potential for improving quality by learning various contexts, such as n-gram phrases, linguistic constructs, or user-profile-based context, to tailor each user's filtering scheme.
Keywords :
Internet; classification; information filtering; text analysis; unsolicited e-mail; Internet; World Wide Web; context-based email filtering; email classification; legitimate email; linguistic construct; n-gram phrase; spam email; text classification; textual content; user profile; Application software; Bayesian methods; Electronic mail; Explosives; Information filtering; Information filters; Internet; Machine learning; Text categorization; Web sites;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Machine Learning and Computing (ICMLC), 2010 Second International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4244-6006-9
Electronic_ISBN :
978-1-4244-6007-6
Type :
conf
DOI :
10.1109/ICMLC.2010.61
Filename :
5460695