• DocumentCode
    2793834
  • Title
    A nonnegative sparsity induced similarity measure with application to cluster analysis of spam images
  • Author
    Gao, Yan ; Choudhary, Alok ; Hua, Gang
  • Author_Institution
    Dept. of EECS, Northwestern Univ., Evanston, IL, USA
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    5594
  • Lastpage
    5597
  • Abstract
    Image spam is a type of email spam that embeds text content in graphical images to bypass traditional spam filters. Most previous approaches focus on filtering image spam on the client side. To effectively detect spammers' attack activities and quickly trace back the spam sources, it is also essential to employ cluster analysis to comprehensively filter image emails on the server side. In this paper, we present a nonnegative sparsity induced similarity measure for cluster analysis of spam images. This similarity measure is based on the assumption that a spam image should be represented well by a nonnegative linear combination of a small number of spam images in the same cluster. This assumption follows from the observation that spammers generate a large number of variants from a single image source using different image processing and manipulation techniques. Experiments on a spam image dataset collected from our department email server demonstrate the advantages of the proposed approach.
  • Keywords
    image processing; information filtering; pattern clustering; text analysis; unsolicited e-mail; attack detection; cluster analysis; email spam; graphical image; image processing; nonnegative sparsity induced similarity measure; spam filter; spam image; text content; Automatic testing; Filtering; Image analysis; Image generation; Image processing; Optical character recognition software; Optical filters; Performance analysis; Unsolicited electronic mail; Weapons; Cluster analysis; Image spam filtering; Nonnegative sparse representation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495246
  • Filename
    5495246