DocumentCode
3717193
Title
Accelerating collaborative filtering using concepts from high performance computing
Author
Mark Gates;Hartwig Anzt;Jakub Kurzak;Jack Dongarra
Author_Institution
Innovative Computing Lab, University of Tennessee, Knoxville, USA
fYear
2015
Firstpage
667
Lastpage
676
Abstract
In this paper we accelerate the Alternating Least Squares (ALS) algorithm used for generating product recommendations on the basis of implicit feedback datasets. We approach the algorithm with concepts proven to be successful in High Performance Computing. This includes the formulation of the algorithm as a mix of cache-optimized algorithm-specific kernels and standard BLAS routines, acceleration via graphics processing units (GPUs), use of parallel batched kernels, and autotuning to identify performance winners. For benchmark datasets, the multi-threaded CPU implementation we propose achieves more than a 10 times speedup over the implementations available in the GraphLab and Spark MLlib software packages. For the GPU implementation, the parameters of an algorithm-specific kernel were optimized using a comprehensive autotuning sweep. This results in an additional 2 times speedup over our CPU implementation.
Keywords
"Yttrium","Kernel","Sparse matrices","Collaboration","Filtering","Symmetric matrices","Acceleration"
Publisher
IEEE
Conference_Titel
Big Data (Big Data), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/BigData.2015.7363811
Filename
7363811