• Title of article

    Cybersecurity attacks: Which dataset should be used to evaluate an intrusion detection system?

  • Author/Authors

    Protić, Danijela D. (Department for Telecommunication and Informatics, Center for Applied Mathematics and Electronics, Serbian Armed Forces, General Staff); Stanković, Miomir M. (Mathematical Institute, Serbian Academy of Sciences and Arts)

  • From page
    970
  • To page
    995
  • Abstract
    Introduction: Analyzing the high-dimensional datasets used for intrusion detection is a challenge for researchers. This paper presents the most frequently used datasets. ADFA comprises two datasets containing records from Linux/Unix systems. AWID is based on actual traces of normal and intrusion activity in an IEEE 802.11 Wi-Fi network. CAIDA collects different types of data in geographically and topologically diverse regions. CIC-IDS2017 examines the HTTP, HTTPS, FTP, SSH, and email protocols. CSE-CIC-2018 includes abstract distribution models for applications, protocols, or lower-level network entities. DARPA contains network traffic data. The ISCX 2012 dataset has profiles of various multi-stage attacks and actual network traffic with background noise. KDD Cup 99 is a collection of data transfers from a virtual environment. Kyoto 2006+ contains records of real network traffic and is used only for anomaly detection. NSL-KDD corrects flaws in KDD Cup 99 caused by redundant and duplicate records. UNSW-NB15 is derived from real normal data and synthesized contemporary attack activities in network traffic.
    Methods: This study uses both quantitative and qualitative techniques. It draws on scientific references and publicly accessible information about the given datasets.
    Results: Datasets are often simulated to meet the objectives of a particular organization. The number of real datasets is very small compared to the number of simulated datasets. Anomaly detection is rarely used today.
    Conclusion: The main characteristics of the datasets are presented, together with a comparative analysis in terms of creation date, size, number of features, traffic types, and purpose.
  • Keywords
    ADFA, AWID, CAIDA, CIC-IDS2017, CSE-CIC-2018, DARPA, ISCX 2012, KDD Cup 99, Kyoto 2006+, NSL-KDD, UNSW-NB15
  • Journal title
    Military Technical Courier
  • Record number

    2773308