DocumentCode :
2856848
Title :
Data preprocessing for prediction of recirculating water chemistry faults
Author :
Qiang, Gao ; Xinmin, Wang ; Chao, Dong ; Chenguang, Li
Author_Institution :
Tianjin Key Lab. for Control Theor. & Applic. in Complicated Syst., Tianjin Univ. of Technol., Tianjin, China
Volume :
14
fYear :
2010
fDate :
22-24 Oct. 2010
Abstract :
The water quality data of a petrochemical company are stored in a LIMS database. To support decision-making in field operations based on these data, appropriate data mining is necessary. However, the accuracy of data mining results depends directly on the quality of the source data, so the raw data must be preprocessed as part of the data mining process. The preprocessing proceeds in four steps. First, the raw data are classified, here according to differences in data acquisition frequency. Second, the raw data are cleaned, which covers noisy data, missing data, and redundant data (here mainly attribute redundancy): noisy data are handled by combining computer-based and manual methods, missing data are mainly filled by interpolation using the mean value, and redundant attributes are deleted. Third, the data are transformed for later processing, mainly by normalization, which eliminates differences in dimension and magnitude in the raw data so that all data can be analyzed together. The raw data are preprocessed with both the mean method and the standard deviation method, the results of the two methods are compared, and the mean method is concluded to be the better normalization method. Fourth, data reduction is carried out, which means reducing the data storage space as far as possible while preserving data integrity; principal component analysis is used for this step.
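The cleaning, normalization, and reduction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give formulas, so the sketch assumes that the "mean method" divides each attribute by its column mean and that the "standard deviation method" is the usual z-score. The function names, the sample values, and the 95% variance threshold are all hypothetical.

```python
import numpy as np


def fill_missing_with_mean(data):
    """Replace NaN entries in each column with that column's mean."""
    filled = np.array(data, dtype=float)
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled


def normalize_by_mean(data):
    """Assumed 'mean method': divide each column by its mean."""
    return data / data.mean(axis=0)


def normalize_by_std(data):
    """Assumed 'standard deviation method': z-score each column."""
    return (data - data.mean(axis=0)) / data.std(axis=0)


def pca_reduce(data, variance_to_keep=0.95):
    """Project onto the leading principal components (via SVD)."""
    centered = data - data.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    # Smallest number of components whose cumulative variance reaches the threshold.
    n_components = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_components = min(n_components, len(singular_values))
    return centered @ components[:n_components].T, explained[:n_components]


# Made-up samples (columns: pH, conductivity, turbidity); NaN marks a missing value.
samples = np.array([
    [7.0, 1200.0, 3.0],
    [7.4, np.nan, 5.0],
    [np.nan, 1300.0, 4.0],
    [7.2, 1250.0, np.nan],
])

filled = fill_missing_with_mean(samples)
scaled = normalize_by_mean(filled)
reduced, explained = pca_reduce(scaled)
print(reduced.shape, explained)
```

Swapping `normalize_by_mean` for `normalize_by_std` before the PCA step gives a simple way to compare the two normalizations on the same data, in the spirit of the comparison the abstract reports.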
Keywords :
data mining; data reduction; fault diagnosis; interpolation; petrochemicals; principal component analysis; data preprocessing; interpolation method; LIMS database; normalization method; petrochemical company; recirculating water chemistry faults prediction; water quality data; Analytical models; Production; Silicon compounds; Solids; Viscosity; data cleaning; data transformation; recirculating water
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Application and System Modeling (ICCASM), 2010 International Conference on
Conference_Location :
Taiyuan
Print_ISBN :
978-1-4244-7235-2
Electronic_ISBN :
978-1-4244-7237-6
Type :
conf
DOI :
10.1109/ICCASM.2010.5622170
Filename :
5622170