• DocumentCode
    3720272
  • Title
    A novel approach of data sanitization by noise addition and knowledge discovery by clustering
  • Author
    Hadi Abdullah;Ahsan Siddiqi;Fuad Bajaber
  • Author_Institution
    University of Richmond, Richmond, USA
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    The security of published data is no less important than that of unpublished data. Therefore, when organizations that record large volumes of data publish that data, PII (Personally Identifiable Information) is removed and the data is sanitized. However, this approach to ensuring data privacy and security can reduce the utility of the published data for knowledge discovery, so a balance is required between the privacy and the utility of published data. In this paper we study this delicate balance by evaluating four data mining clustering techniques for knowledge discovery, and we propose two privacy/utility quantification parameters. We then perform a number of experiments to statistically identify which clustering technique is best suited to a desirable level of privacy/utility as noise is incrementally increased by simultaneously degrading data accuracy, completeness and consistency.
  • Keywords
    "Data privacy","Privacy","Databases","Knowledge discovery","Data security"
  • Publisher
    ieee
  • Conference_Titel
    Computer Networks and Information Security (WSCNIS), 2015 World Symposium on
  • Print_ISBN
    978-1-4799-9906-4
  • Type
    conf
  • DOI
    10.1109/WSCNIS.2015.7368283
  • Filename
    7368283