DocumentCode :
2725431
Title :
Noise Reduction Approach for Decision Tree Construction: A Case Study of Knowledge Discovery on Climate and Air Pollution
Author :
Fukuda, Kyoko
Author_Institution :
Dept. of Math. & Stat., Comput. Sci. & Software Eng., Canterbury Univ., Christchurch
fYear :
2007
fDate :
March 1 2007-April 5 2007
Firstpage :
697
Lastpage :
704
Abstract :
Data mining is more effective on noisy time series with appropriate data pre-processing. Singular spectrum analysis (SSA) is explored as a noise reduction approach for a decision tree classifier applied to noisy data. SSA decomposes a noisy time series into groups of additive components, ordered from low to high frequency. In this study, noisy climate data are decomposed by SSA and used to construct decision trees that predict carbon monoxide (CO) air pollution levels. The analysis shows that separating the annual data by season helps the algorithm; the improvement in classification accuracy varies by season, with the maximum improvement (from 60.7% to 77.3%) found in summer after removing 6.42% of the high-frequency signals, while autumn showed no improvement. Examining the decision tree structures reveals threshold climate values associated with different CO levels; e.g., a light wind speed of ≤ 2.5 m/s and any level of temperature inversion formation are found to be associated with the high CO level (> 0.70 mg/m³). Overall, data pre-processing using SSA shows promise for improving the results of any time series data mining approach. Examining decision trees of climate and air pollution helps increase knowledge about the data, and the studied approaches can be adapted for various future environmental studies.
Keywords :
air pollution; data mining; decision trees; time series; carbon monoxide; data pre-processing; decision tree classifier; decision tree construction; decision tree structures; knowledge discovery; noise reduction; noisy climate data; noisy time series; singular spectrum analysis; temperature inversion formation; Additive noise; Carbon dioxide; Classification tree analysis; Frequency; Noise level; Signal analysis; climate
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0705-2
Type :
conf
DOI :
10.1109/CIDM.2007.368944
Filename :
4221368