Title :
Adaptive neural-fuzzy inference system for classification of rail quality data with bootstrapping-based over-sampling
Author :
Yang, Y.Y. ; Mahfouf, M. ; Panoutsos, G. ; Zhang, Q. ; Thornton, S.
Author_Institution :
Dept. of Autom. Control & Syst. Eng., Univ. of Sheffield, Sheffield, UK
Abstract :
This paper presents an iterative bootstrapping-based data over-sampling strategy, combined with an adaptive neural-fuzzy inference system (ANFIS), to deal with a severely imbalanced data modelling problem. Because real industrial data sets are often very large, containing hundreds of process variables and a huge number of data records, selecting a compact set of input variables is critical for successful modelling and analysis. Significant effort has been devoted to identifying the most relevant input variables through correlation analysis and neural-network-based forward input selection. An optimal majority-to-minority class data ratio, which controls the level of data imbalance for model training, is then determined through the iterative bootstrapping process such that the combined sensitivity and specificity performance is optimised. The iterative bootstrapping ANFIS modelling strategy is applied to a real industrial case study for rail quality classification, with the original data provided by Tata Steel Europe. Preliminary results show good overall performance for the iterative bootstrapping data over-sampling ANFIS modelling.
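The ratio search described in the abstract can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: the abstract does not say how sensitivity and specificity are combined (their sum is assumed here), a single-feature threshold rule stands in for ANFIS, and all function names (`bootstrap_oversample`, `fit_threshold`, `select_ratio`) are hypothetical.

```python
import random

def bootstrap_oversample(majority, minority, ratio, rng):
    """Grow the minority class by sampling with replacement until
    len(majority) / len(result) is roughly `ratio` (never shrinks it)."""
    target = max(len(minority), round(len(majority) / ratio))
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    return list(minority) + extra

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); label 1 is the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec

def fit_threshold(xs, ys):
    """Stand-in classifier (NOT ANFIS): pick the cut on a single feature
    that minimises training errors, predicting 1 when x > cut."""
    best_cut, best_err = None, None
    for cut in sorted(set(xs)):
        err = sum(1 for x, y in zip(xs, ys) if (x > cut) != (y == 1))
        if best_err is None or err < best_err:
            best_cut, best_err = cut, err
    return best_cut

def select_ratio(train_major, train_minor, val_x, val_y, ratios, seed=0):
    """Try each candidate majority-to-minority ratio and keep the one with
    the highest sensitivity + specificity (assumed criterion) on the
    validation data. Returns (best_ratio, best_score)."""
    rng = random.Random(seed)
    best = None
    for ratio in ratios:
        minor = bootstrap_oversample(train_major, train_minor, ratio, rng)
        xs = list(train_major) + minor
        ys = [0] * len(train_major) + [1] * len(minor)
        cut = fit_threshold(xs, ys)
        preds = [1 if x > cut else 0 for x in val_x]
        sens, spec = sensitivity_specificity(val_y, preds)
        score = sens + spec
        if best is None or score > best[1]:
            best = (ratio, score)
    return best
```

Only the training data are resampled in this sketch; the validation data keep their original class balance, so the score reflects performance on the imbalanced distribution.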
Keywords :
adaptive systems; fuzzy reasoning; neural nets; railways; sampling methods; Tata Steel Europe; adaptive neural-fuzzy inference system; analysis operations; bootstrapping-based over-sampling; compact set; correlation analysis; data records; forward input selection; input variables; iterative bootstrapping ANFIS modelling strategy; iterative bootstrapping process; iterative bootstrapping-based data over-sampling strategy; model training; neural network; rail quality classification; rail quality data classification; real industrial data; severely imbalanced data modelling problem; successful modelling; Artificial neural networks; Computational modeling; Correlation; Data models; Rails; Steel; ANFIS; Imbalance data; bootstrapping; data resampling; fuzzy c-means clustering; rail; steel manufacturing;
Conference_Title :
Fuzzy Systems (FUZZ), 2011 IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-7315-1
Electronic_ISBN :
1098-7584
DOI :
10.1109/FUZZY.2011.6007729