DocumentCode :
265334
Title :
Cloud antivirus cost model using machine learning
Author :
Hamzah, Ali Abdullah ; Khattab, Sherif M. ; El-Gamal, Salwa S.
Author_Institution :
Fac. of Comput. & Inf., Cairo Univ., Cairo, Egypt
fYear :
2014
fDate :
15-17 Dec. 2014
Abstract :
Cloud computing is a new generation of computing based on virtualization technology, and more and more applications are being deployed in cloud environments. Malware detection, or antivirus software, has recently been provided as a service in the cloud. A cloud antivirus provider hosts a number of virtual machines, each running the same or different antivirus engines on potentially different sets of workloads (files). From the provider's perspective, optimally allocating physical resources to these virtual machines is crucial to the efficiency of the infrastructure. We propose a search-based optimization approach for solving the resource allocation problem in cloud-based antivirus deployments. An elaborate cost model of the file scanning process in antivirus programs is instrumental to the proposed approach. The general architecture is presented and discussed, and a preliminary experimental investigation into the antivirus cost model is described. The cost model depends on many factors, such as total file size, size of the code segment, and the count and type of embedded files within the executable. However, no single one of these parameters can be reliably used alone to predict file scanning time. Thus, a machine-learning approach that combines all these parameters as features is used to build a classifier for antivirus file scanning time. The best results were obtained with the decision tree classifier, with a highest F-measure of 0.91; the highest F-measure values using LogitBoost, support vector machine, and naïve Bayes were 0.87, 0.85, and 0.82, respectively. We evaluated the accuracy of the classification model against a linear regression model using the Root Mean Square (RMS) measure. We found that the classification model is more accurate than the linear regression model: the average RMS values were 0.988 seconds and 2.44 seconds for the classification model and the linear regression model, respectively.
Keywords :
Bayes methods; cloud computing; computer viruses; embedded systems; learning (artificial intelligence); mean square error methods; pattern classification; regression analysis; resource allocation; support vector machines; virtual machines; virtualisation; F-measure value; RMS measure; antivirus engine; antivirus file scanning time; antivirus program; antivirus software; classification model; cloud antivirus cost model; cloud antivirus provider; cloud computing; cloud environment; cloud-based antivirus deployment; code segment; decision tree classifier; embedded file; file scanning process; linear regression model; logitboost; machine learning; malware detection; naïve Bayes; physical resources allocation; resource allocation problem; root mean square measure; search-based optimization approach; support vector machine; virtual machine; virtualization technology; Decision support systems; Antivirus; Cost Model; Machine Learning; Resource Allocation; Virtualization; cloud Antivirus;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Informatics and Systems (INFOS), 2014 9th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-977-403-689-7
Type :
conf
DOI :
10.1109/INFOS.2014.7036708
Filename :
7036708