Title :
Two-level clustering approach to training data instance selection: A case study for the steel industry
Author :
Koskimäki, Heli ; Juutilainen, Ilmari ; Laurinen, Perttu ; Röning, Juha
Author_Institution :
Dept. of Electr. & Inf. Eng., Univ. of Oulu, Oulu
Abstract :
Huge amounts of information from different industrial processes are now stored in databases, and companies can improve their production efficiency by mining new knowledge from this information. However, when these databases become too large, it is not efficient to process all the available data with practical data mining applications. As a solution, approaches for intelligent selection of training data for model fitting have to be developed. In this article, training instances are selected to fit predictive regression models developed for optimizing steel manufacturing process settings in advance, and the selection is approached from a clustering point of view. Because basic k-means clustering was found to consume too much time and memory for the purpose, a new algorithm was developed to divide the data coarsely, after which k-means clustering could be performed. The instances were selected using the cluster structure, giving more weight to observations from scattered and separated clusters. The study shows that this approach to data set selection can even improve the prediction accuracy of the models. Only a quarter of the data, selected with our approach, was needed to achieve results comparable with a reference case, and the procedure can be easily developed for an actual industrial environment.
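The two-level idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the coarse division algorithm or the exact weighting, so everything below is an assumption made for illustration: the coarse step is a stand-in that splits the data into equal-size blocks along the highest-variance feature, the k-means routine is a plain NumPy version, and the names `kmeans`, `select_instances`, `fraction`, `n_blocks`, and `k` are hypothetical. Scatter is taken as the mean distance of members to their centroid, and separation as the distance to the nearest other centroid.

```python
import numpy as np

def kmeans(X, k, n_iter=20, rng=None):
    """Plain Lloyd-style k-means; returns labels and centroids."""
    rng = np.random.default_rng(rng)
    k = min(k, len(X))
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    # Final assignment so labels match the returned centroids.
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), centers

def select_instances(X, fraction=0.25, n_blocks=4, k=5, seed=0):
    """Return indices of selected rows, favouring scattered, separated clusters."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)

    # Level 1 (assumed stand-in): coarse split into equal-size blocks
    # along the feature with the largest variance.
    axis = X.var(axis=0).argmax()
    blocks = np.array_split(np.argsort(X[:, axis]), n_blocks)

    # Level 2: k-means inside each block, which keeps each run small.
    cluster_of = np.empty(len(X), dtype=int)
    centers, scatter = [], []
    for idx in blocks:
        labels, c = kmeans(X[idx], k, rng=rng)
        for j in range(len(c)):
            members = idx[labels == j]
            if len(members) == 0:
                continue
            cluster_of[members] = len(centers)
            scatter.append(np.linalg.norm(X[members] - c[j], axis=1).mean())
            centers.append(c[j])
    centers, scatter = np.array(centers), np.array(scatter)

    # Separation: distance from each centroid to its nearest neighbour centroid.
    dist = np.linalg.norm(centers[:, None] - centers[None], axis=2)
    np.fill_diagonal(dist, np.inf)
    separation = dist.min(axis=1)

    # Assumed weighting: sum of normalised scatter and separation.
    score = scatter / scatter.mean() + separation / separation.mean() + 1e-9
    p = score[cluster_of]
    p = p / p.sum()

    n_select = int(round(fraction * len(X)))
    return rng.choice(len(X), size=n_select, replace=False, p=p)
```

With `fraction=0.25`, the function keeps a quarter of the rows, matching the proportion mentioned in the abstract; the sketch assumes continuous data with more rows than blocks and does not reproduce the authors' actual algorithm or results.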
Keywords :
data mining; learning (artificial intelligence); manufacturing processes; optimisation; pattern clustering; regression analysis; steel industry; steel manufacture; data instance selection training; data mining; database; optimization; predictive regression model; steel industry; steel manufacturing process; two-level clustering approach; Data mining; Databases; Fitting; Manufacturing processes; Metals industry; Mining industry; Predictive models; Production; Steel; Training data;
Conference_Title :
Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1820-6
ISSN :
1098-7576
DOI :
10.1109/IJCNN.2008.4634228