DocumentCode
2534320
Title
Privacy aware data generation for testing database applications
Author
Wu, Xintao ; Sanghvi, Chintan ; Wang, Yongge ; Zheng, Yuliang
Author_Institution
North Carolina Univ., Charlotte, NC, USA
fYear
2005
fDate
25-27 July 2005
Firstpage
317
Lastpage
326
Abstract
Testing database applications is of great importance. A significant issue in database application testing is the availability of representative data. In this paper, we investigate the problem of generating a synthetic database based on a priori knowledge about a production database. Our approach is to fit a general location model using various characteristics (e.g., constraints, statistics, rules) extracted from the production database and then generate the synthetic data using the learned model. The generated data is valid and similar to real data in terms of statistical distribution; hence it can be used for functional and performance testing. Because the extracted characteristics may contain information that an attacker could use to derive confidential information about individuals, we present a disclosure analysis method that applies cell suppression for identity disclosure analysis and perturbation for value disclosure.
Keywords
database management systems; program testing; software performance evaluation; cell suppression; database application testing; identity disclosure analysis; location model; privacy aware data generation; production database; representative data; statistical distribution; synthetic database; value disclosure; Application software; Character generation; Context modeling; Data mining; Data privacy; Databases; Information analysis; Production; Statistical distributions; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Database Engineering and Application Symposium, 2005. IDEAS 2005. 9th International
ISSN
1098-8068
Print_ISBN
0-7695-2404-4
Type
conf
DOI
10.1109/IDEAS.2005.45
Filename
1540922