• DocumentCode
    1783617
  • Title
    A novel learning method to classify data streams in the internet of things
  • Author
    Khan, Muhammad Asad ; Khan, Ajmal ; Khan, M.N. ; Anwar, Sohel
  • Author_Institution
    Dept. of Inf. Technol., Univ. of Haripur, Haripur, Pakistan
  • fYear
    2014
  • fDate
    11-12 Nov. 2014
  • Firstpage
    61
  • Lastpage
    66
  • Abstract
    Data streams are high volumes of multi-dimensional, unlabeled data generated in environments such as stock markets, astronomical data, Web logs, click streams, and flood, fire, and crop monitoring. Knowledge discovery in data streams is a valuable task for research, business, and the community. The fundamental step of knowledge discovery in a data stream is classifying the stream into target classes. In this paper we propose a classification mechanism for data streams. Conventional classification algorithms are of little significance for data streams because of the streams' complex nature, unbounded memory requirements, and the concept drift problem. The proposed method takes a novel approach to classifying data streams by applying unsupervised techniques such as clustering, followed by a supervised classifier such as a Support Vector Machine (SVM). The high-volume data is sampled and reduced with Simple Aggregation and Approximation (SAX). The density-based clustering algorithm DBSCAN is then applied to the data stream to reveal the number of classes present and subsequently label the data. SVM is a well-known and proven supervised classification algorithm, and it is applied to classify the labeled data. We tested the proposed method on the Intel Lab Data set, a data set of four environmental variables (temperature, voltage, humidity, light) collected through 54 Mica2Dot sensors over 36 days at a per-second rate. We sample the data stream by day, and the SVM classifier is trained on a window of a given size n. The algorithm is evaluated on different test sizes, and an average accuracy of 80% is obtained.
  • Keywords
    Internet of Things; data mining; pattern classification; support vector machines; unsupervised learning; DB Scan; Intel Lab Data set; Internet of things; Mica2Dot sensors; SAX density based clustering algorithm; SVM classifier; data stream classification; drifting problem; environmental variables; humidity; knowledge discovery; learning method; light; memory requirements; simple aggregation and approximation density based clustering algorithm; supervised classification algorithm; supervised classifier; support vector machine; temperature; unsupervised classification techniques; voltage; Hardware; Monitoring; Support vector machines; Density based Clustering; Machine Learning; Supervised Learning; Unsupervised learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering Conference (NSEC), 2014 National
  • Conference_Location
    Rawalpindi
  • Print_ISBN
    978-1-4799-6161-0
  • Type
    conf
  • DOI
    10.1109/NSEC.2014.6998242
  • Filename
    6998242