  • DocumentCode
    3756870
  • Title
    Boosting the Detection of Malicious Documents Using Designated Active Learning Methods
  • Author
    Nir Nissim;Aviad Cohen;Yuval Elovici
  • Author_Institution
    Dept. of Inf. Syst. Eng., Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel
  • fYear
    2015
  • Firstpage
    760
  • Lastpage
    765
  • Abstract
    Most organizations create, send, and receive huge numbers of documents daily. Attackers increasingly take advantage of innocent users who tend to casually open email messages that are assumed to be benign but carry malicious documents. Recent targeted attacks aimed at organizations utilize the new Microsoft Word documents (*.docx). Anti-virus software fails to detect new unknown malicious files, including malicious docx files. In this study, we present the SFEM feature extraction methodology and designated Active Learning (AL) methods, aimed at accurate detection of new unknown malicious docx files and at efficiently enhancing the detection model's capabilities over time. Our AL methods identify and acquire only a small set of new docx files that are most likely malicious, as well as informative benign files; these files are used for enhancing the knowledge stores of both the detection model and the anti-virus software. Results show that our active learning methods used only 14% of the labeled docx files within the organization, which led to a reduction of 95.5% in labeling efforts compared to passive learning and SVM-Margin (an existing active learning method). Our AL methods also showed a significant improvement of 91% in unknown docx malware acquisition compared to passive learning and SVM-Margin, thus providing an improved updating solution for the detection model, as well as for the anti-virus software widely used within organizations.
  • Keywords
    "Organizations","Feature extraction","XML","Learning systems","Software","Malware"
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2015 IEEE 14th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICMLA.2015.52
  • Filename
    7424413