Title :
A self-organizing method for predictive modeling with highly-redundant variables
Author :
Gang Liu;Hui Yang
Author_Institution :
Complex Systems Monitoring, Modeling and Analysis Laboratory, University of South Florida, Tampa, FL 33620 USA
Abstract :
Rapid advances in sensing and information technology have produced big data, a gold mine of the 21st century. However, big data also poses significant challenges for data-driven decision making. In particular, it is not uncommon for big data to involve a large number of variables (or features). Complex interdependence structures among these variables challenge the traditional framework of predictive modeling. This paper presents a new self-organizing network methodology for variable clustering and predictive modeling. Specifically, we develop a new approach, namely nonlinear coupling analysis, to measure nonlinear interdependence structures among variables. Further, all the variables are embedded as nodes in a complex network. Nonlinear-coupling forces move these nodes to derive a self-organizing network topology. As such, variables are clustered as sub-network communities in the space. Experimental results demonstrate that the proposed methodology not only outperforms traditional variable clustering algorithms such as hierarchical clustering and oblique principal component analysis, but also effectively identifies interdependence structures among variables and further improves the performance of predictive modeling. The proposed idea of a self-organizing network is generally applicable to predictive modeling in many disciplines that involve a large number of highly redundant variables.
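The pipeline outlined in the abstract (measure coupling, embed variables as nodes, let forces organize them, read off communities) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the absolute Spearman rank correlation stands in for the paper's nonlinear coupling analysis, linear springs with rest length 1 - coupling stand in for the nonlinear-coupling forces, and all function names, the step size, and the clustering radius are hypothetical choices.

```python
import numpy as np


def rank_coupling(X):
    """Stand-in coupling measure (assumption): absolute Spearman rank
    correlation between the columns of X. It captures monotone nonlinear
    dependence, not the paper's nonlinear coupling analysis."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.abs(np.corrcoef(ranks, rowvar=False))


def self_organize(C, dim=2, steps=1000, lr=0.05, seed=0):
    """Move nodes under spring forces whose rest length is 1 - coupling,
    so strongly coupled variables settle close together (assumed force law)."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    pos = rng.normal(size=(n, dim))
    rest = 1.0 - C                                    # rest[i, i] is 0
    for _ in range(steps):
        diff = pos[None, :, :] - pos[:, None, :]      # diff[i, j] = pos_j - pos_i
        dist = np.linalg.norm(diff, axis=2) + 1e-9    # avoid division by zero
        # A spring pulls i toward j when stretched and pushes it away when compressed.
        force = ((dist - rest) / dist)[:, :, None] * diff
        pos += lr * force.sum(axis=1)
    return pos


def communities(pos, radius=0.5):
    """Label nodes by connected components of the graph linking nodes
    that lie closer than `radius` (hypothetical threshold)."""
    n = len(pos)
    dist = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero((dist[i] < radius) & (labels < 0)):
                labels[j] = current
                stack.append(j)
        current += 1
    return labels


# Synthetic demo: two latent factors, each observed through three
# redundant (partly nonlinear) variables.
rng = np.random.default_rng(1)
z1, z2 = rng.normal(size=(2, 500))
noise = 0.05 * rng.normal(size=(500, 6))
X = np.column_stack([z1, z1**3, np.exp(z1), z2, np.tanh(z2), z2**5]) + noise

C = rank_coupling(X)
pos = self_organize(C)
labels = communities(pos)
print(labels)
```

In this toy setting the variables derived from the same latent factor should end up in the same community. For predictive modeling, one representative per community (or a per-community summary) could then be passed to a downstream model; that selection step is likewise an assumption and is not detailed in the abstract.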
Keywords :
"Predictive models","Correlation","Clustering algorithms","Principal component analysis","Force","Big data","Self-organizing networks"
Conference_Title :
Automation Science and Engineering (CASE), 2015 IEEE International Conference on
Electronic_ISBN :
2161-8089
DOI :
10.1109/CoASE.2015.7294243