Regional Information Center for Science and Technology

DocumentCode :

1662973

Title :

Baseline Semantic Spam Filtering

Author :

Hempelmann, Christian F. ; Mehra, Vikas

Author_Institution :

RiverGlass Inc., Champaign, IL, USA

Volume :

fYear :

2011

Firstpage :

273

Lastpage :

276

Abstract :

This paper presents a meaning-based method to distinguish text without or with little semantic content from text that has meaning which can be processed. The basic method assumes that a semantic analyzer will be able to produce less output from semantically less grammatical input text. The method was pilot-tested on a corpus of blog spam. Future improvements, including a method to distinguish semantically unified from semantically disparate text, are sketched. The tested method, but even more the projected improvements, open up the way to taking the spam filtering arms race to a new level that is very costly to spam producers.

Keywords :

e-mail filters; ontologies (artificial intelligence); text analysis; unsolicited e-mail; baseline semantic spam filtering; blog spam corpus; grammatical input text; meaning-based method; semantic analyzer; spam filtering; text distinguishing; Color; Conferences; Filtering; Humans; Natural languages; Semantics; Tunneling magnetoresistance; ontological semantics; semantics; spam filter;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on

Conference_Location :

Lyon

Print_ISBN :

978-1-4577-1373-6

Electronic_ISBN :

978-0-7695-4513-4

Type :

conf

DOI :

10.1109/WI-IAT.2011.133

Filename :

6040858

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1662973