Title :
Evaluating parallel logistic regression models
Author :
Haoruo Peng; Ding Liang; Chinchul Choi
Author_Institution :
HTC Research Center, Beijing, China
Abstract :
Logistic regression (LR) is widely used in machine learning applications because of its linear model. However, when the training data set is very large, even such a linear model can consume excessive memory and computation time. To address both resource and computation scalability in a big-data setting, we evaluate and compare different approaches along three dimensions: distributed platform, parallel algorithm, and sublinear approximation. Our empirical study provides design guidelines for choosing the most effective combination to meet the performance requirements of a given application.
Keywords :
Big Data; approximation theory; design; learning (artificial intelligence); parallel algorithms; regression analysis; computation scalability; design guidelines; distributed platform; linear model; machine learning; parallel algorithm; parallel logistic regression models; performance requirement; sublinear approximation; Algorithm design and analysis; Approximation algorithms; Computational modeling; Logistics; Machine learning algorithms; Sparks; Vectors; Logistic Regression Model; Parallel Computing; Sublinear Method
Conference_Title :
2013 IEEE International Conference on Big Data
Conference_Location :
Silicon Valley, CA
DOI :
10.1109/BigData.2013.6691743