• DocumentCode
    30125
  • Title
    Correlated Differential Privacy: Hiding Information in Non-IID Data Set
  • Author
    Tianqing Zhu; Ping Xiong; Gang Li; Wanlei Zhou
  • Author_Institution
    Sch. of Inf. Technol., Deakin Univ., Melbourne, VIC, Australia
  • Volume
    10
  • Issue
    2
  • fYear
    2015
  • fDate
    Feb. 2015
  • Firstpage
    229
  • Lastpage
    242
  • Abstract
    Privacy preservation in data mining and data release has attracted increasing research interest over the past decades. Differential privacy is an influential privacy notion that offers a rigorous and provable privacy guarantee for data mining and data release. Existing studies on differential privacy assume that records in a data set are sampled independently. In real-world applications, however, records are rarely independent: the relationships among records are referred to as correlated information, and such a data set is called a correlated data set. A differential privacy technique applied to a correlated data set will disclose more information than expected, which is a serious privacy violation. Although recent research has addressed this new privacy violation, a solid solution for correlated data sets is still lacking. Moreover, how to reduce the large amount of noise that differential privacy incurs on correlated data sets has yet to be explored. To fill this gap, this paper proposes an effective correlated differential privacy solution by defining a correlated sensitivity and designing a correlated data releasing mechanism. By taking the correlation levels between records into account, the proposed correlated sensitivity can significantly decrease the noise compared with the traditional global sensitivity. The correlated data releasing mechanism, the correlated iteration mechanism, is designed on the basis of an iterative method to answer a large number of queries. Compared with the traditional method, the proposed correlated differential privacy solution enhances the privacy guarantee for a correlated data set at a lower accuracy cost. Experimental results show that the proposed solution outperforms traditional differential privacy in terms of mean square error on large groups of queries, which suggests that correlated differential privacy can retain utility while preserving privacy.
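    The contrast the abstract draws between global and correlated sensitivity can be sketched roughly as follows. This is an illustrative sketch only: the `delta` correlation matrix, the per-record impact values, and the linear weighting are assumptions standing in for the paper's exact definitions, which the abstract does not spell out.

    ```python
    import numpy as np

    def correlated_sensitivity(delta, record_impacts):
        # Weight each record's query impact by its assumed correlation with
        # every other record, then take the worst case over records.
        # delta[i][j] in [0, 1] is an assumed correlation degree: 0 means
        # records i and j are independent, 1 means fully correlated.
        n = len(record_impacts)
        return max(
            sum(abs(delta[i][j]) * record_impacts[j] for j in range(n))
            for i in range(n)
        )

    def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
        # Standard Laplace mechanism: noise scale = sensitivity / epsilon.
        return true_answer + rng.laplace(scale=sensitivity / epsilon)

    # Count query over 4 records; deleting any one record changes the count by 1.
    impacts = [1.0, 1.0, 1.0, 1.0]

    # Fully correlated worst case: treating all 4 records as one unit forces a
    # sensitivity of 4 (global sensitivity multiplied by the correlated group size).
    full = [[1.0] * 4 for _ in range(4)]

    # Weakly correlated case: off-diagonal correlation of 0.1 yields a much
    # smaller sensitivity, hence much less Laplace noise for the same epsilon.
    weak = [[1.0 if i == j else 0.1 for j in range(4)] for i in range(4)]

    rng = np.random.default_rng(0)
    cs_full = correlated_sensitivity(full, impacts)   # 4.0
    cs_weak = correlated_sensitivity(weak, impacts)   # 1.3
    noisy = laplace_mechanism(4.0, cs_weak, epsilon=1.0, rng=rng)
    ```

    The point the abstract makes falls out of the numbers: when correlations are weak, calibrating noise to the correlated sensitivity (1.3) rather than the worst-case multiplied global sensitivity (4.0) gives the same privacy guarantee with far less noise.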
  • Keywords
    data encapsulation; data mining; data privacy; iterative methods; correlated data releasing mechanism; correlated data set; correlated differential privacy solution; correlated information; correlated iteration mechanism; correlated levels; correlated sensitivity; data release; influential privacy notion; information hiding; non-IID data set; privacy guarantee; privacy preserving; privacy violation; traditional global sensitivity; Correlation; Couplings; Histograms; Noise; Privacy; Sensitivity; Correlated Dataset; Differential Privacy; Non-IID Dataset
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Information Forensics and Security
  • Publisher
    IEEE
  • ISSN
    1556-6013
  • Type
    jour
  • DOI
    10.1109/TIFS.2014.2368363
  • Filename
    6949097