Title :
Combination of Time Series, Decision Tree and Clustering: A Case Study in Aerology Event Prediction
Author :
Lajevardi, Seyed Behzad ; Minaei-Bidgoli, Behrouz
Author_Institution :
Comput. Eng. Dept., Iran Univ. of Sci. & Technol., Tehran
Abstract :
Predictive systems use historical and other available data to predict an event. In this paper we propose a general framework for predicting aerology events from time series streams and an event stream, using a combination of the K-means clustering algorithm and the C5 decision tree algorithm. First, we find the closest time series record for each event, which gives us the values of the different parameters at the time the event occurred. Using K-means, we add a field to the data set that records the cluster of each record; we then predict events with the C5 algorithm, one of the well-known decision tree algorithms. Together with a time series model, this framework can predict future events efficiently. We gathered data from 1961 to 2005 from the aerology organization for the Tehran Mehrabad Station. The data contains fields such as wet bulb, relative humidity, amount of cloud, and wind speed, and includes 17 types of events. Time series models can predict the next time series parameter values, and with this framework the closest event can then be predicted. The C5 method alone predicts events correctly in 74.11 percent of cases and wrongly in 25.89 percent. With the aid of the K-means clustering algorithm, correct predictions increase to 85 percent and wrong predictions fall to 15 percent. 90 percent of the data was used as the training set and 10 percent as the test set. We use 10-fold cross validation to evaluate our prediction rate. This framework is a first estimate in the area of event prediction for a very large aerology data set and can be extended to many different data sets in other environments.
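The first two steps described in the abstract (matching each event to its closest time series record, then adding a K-means cluster field) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `closest_record` and `kmeans`, the toy timestamps, the two features (relative humidity, wind speed) and the event names are all hypothetical, and the C5 training step is omitted because the abstract gives no details of it.

```python
from bisect import bisect_left
import random


def closest_record(times, t):
    """Index of the record whose timestamp is nearest to t (times must be sorted)."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    # Pick the nearer neighbour; ties go to the earlier record.
    return i - 1 if t - times[i - 1] <= times[i] - t else i


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical toy data: observation hours and (relative humidity, wind speed).
times = [0, 3, 6, 9, 12]
features = [(90.0, 2.0), (88.0, 3.0), (40.0, 12.0), (35.0, 14.0), (85.0, 2.5)]
events = [(4, "fog"), (10, "dust")]  # (hour, event type), also hypothetical

# Cluster every record, then attach the cluster label as an extra field.
labels = kmeans(features, k=2)

# One training row per event: parameters of the closest record, cluster, event type.
rows = []
for hour, event_type in events:
    idx = closest_record(times, hour)
    rows.append(features[idx] + (labels[idx], event_type))

print(rows)
```

Rows of this shape (parameter values, cluster field, event type) are what a decision tree learner such as C5 would then be trained on, with the event type as the target.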
Keywords :
decision trees; geophysics computing; pattern clustering; time series; C5 algorithm; K-means clustering algorithm; Tehran Mehrabad Station; aerology event prediction; decision tree; events stream; predictive systems; time series; Clustering algorithms; Computer errors; Data engineering; Data mining; Decision trees; Humidity; Partitioning algorithms; Prediction algorithms; Predictive models; Testing; Artificial Neural Network; Data Mining; Decision Tree; Prediction; Time Series;
Conference_Title :
Computer and Electrical Engineering, 2008. ICCEE 2008. International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-0-7695-3504-3
DOI :
10.1109/ICCEE.2008.110