• DocumentCode
    2748950
  • Title
    Classification of Malware Based on String and Function Feature Selection
  • Author
    Islam, Rafiqul ; Tian, Ronghua ; Batten, Lynn ; Versteeg, Steve
  • Author_Institution
    Sch. of IT, Deakin Univ., Melbourne, VIC, Australia
  • fYear
    2010
  • fDate
    19-20 July 2010
  • Firstpage
    9
  • Lastpage
    17
  • Abstract
    Anti-malware software producers are continually challenged to identify and counter new malware as it is released into the wild. A dramatic increase in malware production in recent years has rendered the conventional method of manually determining a signature for each new malware sample untenable. This paper presents a scalable, automated approach for detecting and classifying malware by using pattern recognition algorithms and statistical methods at various stages of the malware analysis life cycle. Our framework combines the static features of function length and printable string information extracted from malware samples into a single test which gives classification results better than those achieved by using either feature individually. In our testing we input feature information from close to 1400 unpacked malware samples to a number of different classification algorithms. Using k-fold cross validation on the malware, which includes Trojans and viruses, along with 151 clean files, we achieve an overall classification accuracy of over 98%.
  • Keywords
    invasive software; pattern recognition; statistical analysis; function feature selection; k-fold cross validation; malware analysis life cycle; malware classification; pattern recognition algorithm; static feature; statistical method; string feature selection; Accuracy; Data mining; Databases; Feature extraction; Malware; Software; Support vector machine classification; Malware; classification; function length; string;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cybercrime and Trustworthy Computing Workshop (CTC), 2010 Second
  • Conference_Location
    Ballarat, VIC
  • Print_ISBN
    978-1-4244-8054-8
  • Electronic_ISBN
    978-0-7695-4186-0
  • Type
    conf
  • DOI
    10.1109/CTC.2010.11
  • Filename
    5615149