• DocumentCode
    60157
  • Title

    A Probabilistic Discriminative Model for Android Malware Detection with Decompiled Source Code

  • Author

    Lei Cen ; Gates, Christopher S. ; Luo Si ; Ninghui Li

  • Author_Institution
    Purdue Univ., West Lafayette, IN, USA
  • Volume
    12
  • Issue
    4
  • fYear
    2015
  • fDate
    July-Aug. 1 2015
  • Firstpage
    400
  • Lastpage
    412
  • Abstract
    Mobile devices are an important part of our everyday lives, and the Android platform has become a market leader. In recent years, a number of approaches for Android malware detection have been proposed, using permissions, source code analysis, or dynamic analysis. In this paper, we propose to use a probabilistic discriminative model based on regularized logistic regression for Android malware detection. Through extensive experimental evaluation, we demonstrate that it can generate probabilistic outputs with highly accurate classification results. In particular, we propose to use Android API calls as features extracted from decompiled source code, and analyze and explore issues in feature granularity, feature representation, feature selection, and regularization. We show that the probabilistic discriminative model also works well with permissions, and substantially outperforms the state-of-the-art methods for Android malware detection with application permissions. Furthermore, the discriminative learning model achieves the best detection results by combining both decompiled source code and application permissions. To the best of our knowledge, this is the first research that proposes a probabilistic discriminative model for Android malware detection with a thorough study of the desired representation of decompiled source code, and it is the first research work for the Android malware detection task that combines analysis of both decompiled source code and application permissions.
  • Keywords
    Android (operating system); application program interfaces; feature extraction; feature selection; invasive software; learning (artificial intelligence); mobile computing; probability; program compilers; regression analysis; source code (software); Android API; Android malware detection; application permissions; decompiled source code analysis; discriminative learning model; feature extraction; feature granularity; feature representation; feature selection; mobile devices; probabilistic discriminative model; regularized logistic regression; Androids; Feature extraction; Humanoid robots; Malware; Measurement; Probabilistic logic; Smart phones; Android; discriminative model; machine learning; malicious application;
  • fLanguage
    English
  • Journal_Title
    Dependable and Secure Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5971
  • Type

    jour

  • DOI
    10.1109/TDSC.2014.2355839
  • Filename
    6894210