DocumentCode :
2515500
Title :
Highly scalable parallel collaborative filtering algorithm
Author :
Narang, Ankur ; Gupta, Raj ; Joshi, Anupam ; Garg, Vikas K.
Author_Institution :
IBM India Res. Lab., New Delhi, India
fYear :
2010
fDate :
19-22 Dec. 2010
Firstpage :
1
Lastpage :
10
Abstract :
Collaborative filtering (CF) based recommender systems have gained wide popularity at Internet companies such as Amazon, Netflix, Google News, and others. These systems make automatic predictions about the interests of a user by inferring from information about like-minded users. Real-time CF on highly sparse, massive datasets, while achieving high prediction accuracy, is a computationally challenging problem. In this paper, we present the design of a soft real-time (around 1 min.) parallel CF algorithm based on the Concept Decomposition technique. Our parallel algorithm has been optimized for multicore/many-core architectures while maintaining a prediction accuracy of 0.84 RMSE. Using the Netflix dataset, we demonstrate the performance and scalability of our algorithm (in both batch mode and online mode) on a 32-core Power6 based SMP system. Our parallel algorithm delivered a training time of 64 s on the full Netflix dataset and a prediction time of 4.5 s on 1.4M ratings (3.2 μs per rating prediction). This is 12.6× better than the best known sequential training time and around 33× better than the best known sequential prediction time, along with high accuracy (0.84 RMSE). To the best of our knowledge, this is also the best known parallel performance at such high accuracy.
Keywords :
Internet; information filtering; recommender systems; Netflix dataset; Power6 based SMP system; concept decomposition technique; highly scalable parallel collaborative filtering algorithm; many-core architecture; multicore architecture; parallel algorithm; sequential prediction time; Algorithm design and analysis; Approximation algorithms; Approximation methods; Clustering algorithms; Instruction sets; Matrix decomposition; Motion pictures
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computing (HiPC), 2010 International Conference on
Conference_Location :
Dona Paula
Print_ISBN :
978-1-4244-8518-5
Electronic_ISBN :
978-1-4244-8519-2
Type :
conf
DOI :
10.1109/HIPC.2010.5713175
Filename :
5713175