DocumentCode
60157
Title
A Probabilistic Discriminative Model for Android Malware Detection with Decompiled Source Code
Author
Lei Cen ; Gates, Christopher S. ; Luo Si ; Ninghui Li
Author_Institution
Purdue Univ., West Lafayette, IN, USA
Volume
12
Issue
4
fYear
2015
fDate
July-Aug. 1 2015
Firstpage
400
Lastpage
412
Abstract
Mobile devices are an important part of our everyday lives, and the Android platform has become a market leader. In recent years a number of approaches for Android malware detection have been proposed, using permissions, source code analysis, or dynamic analysis. In this paper, we propose to use a probabilistic discriminative model based on regularized logistic regression for Android malware detection. Through extensive experimental evaluation, we demonstrate that it can generate probabilistic outputs with highly accurate classification results. In particular, we propose to use Android API calls as features extracted from decompiled source code, and analyze and explore issues in feature granularity, feature representation, feature selection, and regularization. We show that the probabilistic discriminative model also works well with permissions, and substantially outperforms the state-of-the-art methods for Android malware detection with application permissions. Furthermore, the discriminative learning model achieves the best detection results by combining both decompiled source code and application permissions. To the best of our knowledge, this is the first research that proposes a probabilistic discriminative model for Android malware detection with a thorough study of the desired representation of decompiled source code, and it is the first research work for the Android malware detection task that combines analysis of decompiled source code with application permissions.
Keywords
Android (operating system); application program interfaces; feature extraction; feature selection; invasive software; learning (artificial intelligence); mobile computing; probability; program compilers; regression analysis; source code (software); Android API; Android malware detection; application permissions; decompiled source code analysis; discriminative learning model; feature extraction; feature granularity; feature representation; feature selection; mobile devices; probabilistic discriminative model; regularized logistic regression; Androids; Feature extraction; Humanoid robots; Malware; Measurement; Probabilistic logic; Smart phones; Android; discriminative model; machine learning; malicious application;
fLanguage
English
Journal_Title
Dependable and Secure Computing, IEEE Transactions on
Publisher
IEEE
ISSN
1545-5971
Type
jour
DOI
10.1109/TDSC.2014.2355839
Filename
6894210