DocumentCode :
1631383
Title :
The price of privacy in untrusted recommendation engines
Author :
Banerjee, Sean ; Hegde, Nayana ; Massoulie, Laurent
Author_Institution :
Dept. of ECE, Univ. of Texas at Austin, Austin, TX, USA
fYear :
2012
Firstpage :
920
Lastpage :
927
Abstract :
Recent increase in online privacy concerns prompts the following question: can a recommendation engine be accurate if end-users do not entrust it with their private data? To answer this, we study the problem of predicting user-ratings under local or `user-end´ differential privacy, a powerful, formal notion of data privacy. We develop a systematic approach for lower bounds on the complexity of learning item structure from privatized user inputs, based on mutual information. Our results identify a sample complexity separation between learning in the scarce information regime and the rich information regime, thereby highlighting the role of the amount of ratings (information) available to each user. In the information-rich regime (where each user rates a constant fraction of items), a spectral clustering approach is shown to achieve optimal sample complexity. However, the information-scarce regime (where each user rates only a vanishing fraction of the total item set) is found to require a fundamentally different approach. We propose a new algorithm, MaxSense, and show that it achieves optimal sample complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to (i) learning based on 1-bit sketches (in contrast to differentially private sketches), and (ii) adaptive learning, where queries can be adapted based on answers to past queries.
Keywords :
Internet; computational complexity; data privacy; query processing; recommender systems; sampling methods; trusted computing; 1-bit sketches; MaxSense algorithm; adaptive learning; data privacy; end-users; information-rich regime; information-scarce regime; learning item structure complexity; lower bounds; mutual information; online privacy concerns; optimal sample complexity; privatized user inputs; queries; spectral clustering approach; untrusted recommendation engines; user-end differential privacy; user-rating prediction; Algorithm design and analysis; Clustering algorithms; Complexity theory; Data privacy; Mutual information; Privacy; Recommender systems;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Communication, Control, and Computing (Allerton), 2012 50th Annual Allerton Conference on
Conference_Location :
Monticello, IL
Print_ISBN :
978-1-4673-4537-8
Type :
conf
DOI :
10.1109/Allerton.2012.6483317
Filename :
6483317
Link To Document :
بازگشت