DocumentCode :
2984419
Title :
Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems
Author :
Hsiang-Fu Yu ; Cho-Jui Hsieh ; Si Si ; Inderjit S. Dhillon
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Austin, Austin, TX, USA
fYear :
2012
fDate :
10-13 Dec. 2012
Firstpage :
765
Lastpage :
774
Abstract :
Matrix factorization, when the matrix has missing values, has become one of the leading techniques for recommender systems. To handle web-scale datasets with millions of users and billions of ratings, scalability becomes an important issue. Alternating Least Squares (ALS) and Stochastic Gradient Descent (SGD) are two popular approaches to compute matrix factorization, and there has been a recent flurry of activity to parallelize them. However, due to its cubic time complexity in the target rank, ALS is not scalable to large-scale datasets. On the other hand, SGD conducts efficient updates but usually suffers from slow convergence that is sensitive to its parameters. Coordinate descent, a classical optimization approach, has been applied to many other large-scale problems, but its application to matrix factorization for recommender systems has not been explored thoroughly. In this paper, we show that coordinate descent based methods have a more efficient update rule compared to ALS, and converge faster and more stably than SGD. We study different update sequences and propose the CCD++ algorithm, which updates rank-one factors one by one. In addition, CCD++ can be easily parallelized on both multi-core and distributed systems. We empirically show that CCD++ is much faster than ALS and SGD in both settings. As an example, on a synthetic dataset with 2 billion ratings, CCD++ is 4 times faster than both SGD and ALS on a distributed system with 20 machines.
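Code Sketch (editorial) :
A minimal NumPy sketch of the rank-one update scheme the abstract describes: for each rank-one factor, the residual is restored, the factor's two vectors are refit with closed-form one-variable coordinate descent, and the residual is updated. The function name ccdpp, the regularizer lam, the zero initialization of H, and the v-before-u inner update order are illustrative assumptions, not the authors' reference implementation.

import numpy as np
from scipy.sparse import coo_matrix

def ccdpp(A, k=10, lam=0.1, outer_iters=10, inner_iters=3, seed=0):
    """Sketch of CCD++-style rank-one coordinate descent on the observed
    entries of a sparse ratings matrix A (scipy COO, shape m x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    rows, cols = A.row, A.col
    W = 0.01 * rng.standard_normal((m, k))  # illustrative random init
    H = np.zeros((n, k))                    # H = 0 makes the initial residual equal A
    R = A.data.astype(np.float64).copy()    # residual A - W H^T on observed entries
    for _ in range(outer_iters):
        for t in range(k):                  # update rank-one factors one by one
            u, v = W[:, t].copy(), H[:, t].copy()
            Rhat = R + u[rows] * v[cols]    # add factor t back into the residual
            for _ in range(inner_iters):
                # closed-form 1-D updates: v_j = sum_i Rhat_ij u_i / (lam + sum_i u_i^2),
                # then the symmetric update for u_i (v before u, since H starts at 0)
                num = np.zeros(n); den = np.full(n, lam)
                np.add.at(num, cols, Rhat * u[rows])
                np.add.at(den, cols, u[rows] ** 2)
                v = num / den
                num = np.zeros(m); den = np.full(m, lam)
                np.add.at(num, rows, Rhat * v[cols])
                np.add.at(den, rows, v[cols] ** 2)
                u = num / den
            R = Rhat - u[rows] * v[cols]    # remove the refit factor from the residual
            W[:, t], H[:, t] = u, v
    return W, H

# Example use: A = coo_matrix((vals, (i, j)), shape=(m, n)); W, H = ccdpp(A)

Because each factor is refit against an explicit residual, the per-update cost is linear in the target rank, in contrast to the cubic cost of the ALS least-squares solves; the paper's parallel variants partition these updates across cores or machines.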
Keywords :
computational complexity; convergence; gradient methods; matrix decomposition; optimisation; parallel algorithms; recommender systems; stochastic processes; ALS; CCD++ algorithm; SGD; alternating least squares; classical optimization approach; coordinate descent based methods; cubic time complexity; distributed systems; large-scale datasets; large-scale problems; multicore systems; parallel algorithms; parallel matrix factorization; recommender systems; scalable coordinate descent; slow convergence; stable convergence; stochastic gradient descent; synthetic dataset; update rule; update sequences; updates rank-one factors; web-scale datasets; Acceleration; Convergence; Least squares approximation; Recommender systems; Time complexity; Low rank approximation; Matrix factorization; Parallelization; Recommender systems
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2012 IEEE 12th International Conference on Data Mining (ICDM)
Conference_Location :
Brussels, Belgium
ISSN :
1550-4786
Print_ISBN :
978-1-4673-4649-8
Type :
conf
DOI :
10.1109/ICDM.2012.168
Filename :
6413853