DocumentCode
568992
Title
Exploring re-identification risks in public domains
Author
Ramachandran, Aditi ; Singh, Lisa ; Porter, Edward ; Nagle, Frank
Author_Institution
Georgetown Univ., Washington, DC, USA
fYear
2012
fDate
16-18 July 2012
Firstpage
35
Lastpage
42
Abstract
Re-identification of sensitive data has been studied extensively, but the emergence of online social networks and the popularity of digital communications have increased the ability to use public data for re-identification. This work begins by presenting two case studies of sensitive data re-identification. We conclude that targeted re-identification using traditional variables is not only possible, but fairly straightforward given the large amount of public data available. However, our first case study also indicates that large-scale re-identification is less likely. We then consider methods for agencies such as the Census Bureau to identify variables that cause individuals to be vulnerable without testing all combinations of variables. We show the effectiveness of different strategies on a Census Bureau data set and on a synthetic data set.
Keywords
security of data; social networking (online); census bureau data set; data reidentification risk; digital communication; large-scale reidentification; online social network; public data; public domain; sensitive data reidentification; Accuracy; Data privacy; Databases; Facebook; Sociology; Twitter;
fLanguage
English
Publisher
IEEE
Conference_Titel
Privacy, Security and Trust (PST), 2012 Tenth Annual International Conference on
Conference_Location
Paris
Print_ISBN
978-1-4673-2323-9
Electronic_ISBN
978-1-4673-2325-3
Type
conf
DOI
10.1109/PST.2012.6297917
Filename
6297917