• DocumentCode
    568992
  • Title
    Exploring re-identification risks in public domains
  • Author
    Ramachandran, Aditi ; Singh, Lisa ; Porter, Edward ; Nagle, Frank
  • Author_Institution
    Georgetown Univ., Washington, DC, USA
  • fYear
    2012
  • fDate
    16-18 July 2012
  • Firstpage
    35
  • Lastpage
    42
  • Abstract
    While re-identification of sensitive data has been studied extensively, with the emergence of online social networks and the popularity of digital communications, the ability to use public data for re-identification has increased. This work begins by presenting two different case studies for sensitive data re-identification. We conclude that targeted re-identification using traditional variables is not only possible, but fairly straightforward given the large amount of public data available. However, our first case study also indicates that large-scale re-identification is less likely. We then consider methods for agencies such as the Census Bureau to identify variables that cause individuals to be vulnerable without testing all combinations of variables. We show the effectiveness of different strategies on a Census Bureau data set and on a synthetic data set.
  • Keywords
    security of data; social networking (online); census bureau data set; data reidentification risk; digital communication; large-scale reidentification; online social network; public data; public domain; sensitive data reidentification; Accuracy; Data privacy; Databases; Facebook; Sociology; Twitter
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Privacy, Security and Trust (PST), 2012 Tenth Annual International Conference on
  • Conference_Location
    Paris
  • Print_ISBN
    978-1-4673-2323-9
  • Electronic_ISBN
    978-1-4673-2325-3
  • Type
    conf
  • DOI
    10.1109/PST.2012.6297917
  • Filename
    6297917