Title :
Large-Scale Matrix Factorization Using MapReduce
Author :
Sun, Zhengguo ; Li, Tao ; Rishe, Naphtali
Author_Institution :
Sch. of Comput. Sci., Florida Int. Univ., Miami, FL, USA
Abstract :
Due to the popularity of nonnegative matrix factorization and the increasing availability of massive data sets, researchers are facing the problem of factorizing large-scale matrices with dimensions on the order of millions. Recent research has shown that it is feasible to factorize a million-by-million matrix with billions of nonzero elements on a MapReduce cluster. In this work, we present three different matrix multiplication implementations and scale up three types of nonnegative matrix factorizations on MapReduce. Experiments on both synthetic and real-world datasets show the excellent scalability of our proposed algorithms.
Keywords :
distributed programming; matrix decomposition; matrix multiplication; MapReduce; data mining; large-scale matrix factorization; machine learning; nonnegative matrix factorization; real-world datasets
Conference_Title :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
DOI :
10.1109/ICDMW.2010.155