• DocumentCode
    2554810
  • Title

    EvilSeed: A Guided Approach to Finding Malicious Web Pages

  • Author

    Invernizzi, L. ; Comparetti, Paolo Milani ; Benvenuti, S. ; Kruegel, Christopher ; Cova, M. ; Vigna, Giovanni

  • Author_Institution
    UC Santa Barbara, Santa Barbara, CA, USA
  • fYear
    2012
  • fDate
    20-23 May 2012
  • Firstpage
    428
  • Lastpage
    442
  • Abstract
    Malicious web pages that use drive-by download attacks or social engineering techniques to install unwanted software on a user's computer have become the main avenue for the propagation of malicious code. To search for malicious web pages, the first step is typically to use a crawler to collect URLs that are live on the Internet. Then, fast prefiltering techniques are employed to reduce the number of pages that need to be examined by more precise, but slower, analysis tools (such as honey clients). While effective, these techniques require a substantial amount of resources. A key reason is that the crawler encounters many pages on the web that are benign, that is, the "toxicity" of the stream of URLs being analyzed is low. In this paper, we present EVILSEED, an approach to search the web more efficiently for pages that are likely malicious. EVILSEED starts from an initial seed of known, malicious web pages. Using this seed, our system automatically generates search engine queries to identify other malicious pages that are similar or related to the ones in the initial seed. By doing so, EVILSEED leverages the crawling infrastructure of search engines to retrieve URLs that are much more likely to be malicious than a random page on the web. In other words, EVILSEED increases the "toxicity" of the input URL stream. Also, we envision that the features that EVILSEED presents could be directly applied by search engines in their prefilters. We have implemented our approach, and we evaluated it on a large-scale dataset. The results show that EVILSEED is able to identify malicious web pages more efficiently when compared to crawler-based approaches.
  • Keywords
    Internet; search engines; security of data; EvilSeed; Internet; URL; crawler-based approaches; drive-by download attacks; fast prefiltering techniques; malicious Web pages; malicious code; search engines queries; social engineering techniques; unwanted software; Crawlers; Feature extraction; Google; Malware; Search engines; Web pages; Drive-By Downloads; Guided Crawl-Web Security; Guided Crawling; Web Security;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Security and Privacy (SP), 2012 IEEE Symposium on
  • Conference_Location
    San Francisco, CA
  • ISSN
    1081-6011
  • Print_ISBN
    978-1-4673-1244-8
  • Electronic_ISBN
    1081-6011
  • Type

    conf

  • DOI
    10.1109/SP.2012.33
  • Filename
    6234428