Title :
VIF Regression: A Fast Regression Algorithm for Large Data
Author :
Lin, Dongyu ; Foster, Dean P.
Author_Institution :
Dept. of Stat., Univ. of Pennsylvania, Philadelphia, PA, USA
Abstract :
We propose a fast regression algorithm that substantially reduces the computational complexity of the feature search while retaining good accuracy. It is also guaranteed to discover correlated features that are collectively predictive and to avoid model over-fitting. Because the algorithm statistically controls the mFDR (marginal False Discovery Rate), it can search in a single pass, and the accuracy of the sparse model it selects is guaranteed without cross validation. Numerical results show that our algorithm is much faster than the other algorithms compared and is competitive in accuracy with the best but slower ones.
Keywords :
computational complexity; regression analysis; VIF regression; cross validation; mFDR; marginal false discovery rate; regression algorithm; sparse model; Computational modeling; Data mining; Global Positioning System; Input variables; Large-scale systems; Predictive models; Statistics; Testing; false discovery rate; stepwise regression; variable selection; variance inflation factor
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4244-5242-2
ISSN :
1550-4786
DOI :
10.1109/ICDM.2009.146