• DocumentCode
    2285279
  • Title
    Combination of Time Series, Decision Tree and Clustering: A Case Study in Aerology Event Prediction
  • Author
    Lajevardi, Seyed Behzad ; Minaei-Bidgoli, Behrouz
  • Author_Institution
    Comput. Eng. Dept., Iran Univ. of Sci. & Technol., Tehran
  • fYear
    2008
  • fDate
    20-22 Dec. 2008
  • Firstpage
    111
  • Lastpage
    115
  • Abstract
    Predictive systems use historical and other available data to predict an event. In this paper we propose a general framework for predicting aerology events from time series streams and an event stream, using a combination of the K-means clustering algorithm and the C5 decision tree algorithm. First, we find the closest time series record for each event, which gives us the parameter values at the time the event occurred. Using K-means, we add a field to the data set that identifies the cluster of each record; we then predict events with the C5 algorithm, one of the well-known decision tree algorithms. Together, this framework and a time series model can predict future events efficiently. We gathered data from 1961 to 2005 from the aerology organization for Tehran Mehrabad Station. The data contains fields such as wet bulb, relative humidity, amount of cloud, and wind speed, and includes 17 types of events. Time series models can predict the next values of the time series parameters, and the framework can then predict the closest event. The C5 method alone predicts events with 74.11 percent correct and 25.89 percent wrong. With the aid of the K-means clustering algorithm, the correct prediction rate increases to 85 percent and the wrong rate falls to 15 percent. 90 percent of the data was used as the training set and 10 percent as the test set; we use 10-fold cross validation to evaluate our prediction rate. This framework is the first estimation in the area of event prediction for a huge aerology data set and can be extended to many different data sets in other environments.
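    The first two steps described in the abstract (matching each event to its closest time series record, then adding a K-means cluster field) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy measurements, the event names, and the choice to seed centroids with the first k points are all hypothetical, and the C5 decision tree step is not shown.

```python
from bisect import bisect_left


def closest_record(times, t):
    """Return the index of the entry in sorted `times` nearest to `t`."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick whichever neighbour is nearer; ties go to the earlier record.
    return i - 1 if t - times[i - 1] <= times[i] - t else i


def kmeans_labels(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: (hour, relative humidity in %, wind speed).
series = [(0, 90.0, 2.0), (3, 88.0, 3.0), (6, 40.0, 12.0), (9, 35.0, 14.0)]
events = [(1, "fog"), (8, "dust")]  # (hour, made-up event type)

times = [row[0] for row in series]
labels = kmeans_labels([row[1:] for row in series], k=2)

# Each training row: measurements of the closest record, its cluster id,
# and the event label that a decision tree would then learn to predict.
training = []
for t, name in events:
    idx = closest_record(times, t)
    training.append((*series[idx][1:], labels[idx], name))
```

    Each row of `training` carries the cluster id as an extra input field next to the raw measurements, which is the role the abstract gives to K-means before the decision tree is trained.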
  • Keywords
    decision trees; geophysics computing; pattern clustering; time series; C5 algorithm; K-means clustering algorithm; Tehran Mehrabad Station; aerology event prediction; decision tree; events stream; predictive systems; time series; Clustering algorithms; Computer errors; Data engineering; Data mining; Decision trees; Humidity; Partitioning algorithms; Prediction algorithms; Predictive models; Testing; Artificial Neural Network; Data Mining; Decision Tree; Prediction; Time Series;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Electrical Engineering, 2008. ICCEE 2008. International Conference on
  • Conference_Location
    Phuket
  • Print_ISBN
    978-0-7695-3504-3
  • Type
    conf
  • DOI
    10.1109/ICCEE.2008.110
  • Filename
    4740957