Title :
Detection and limitation of interval inference in statistical databases
Author :
Boyens, Claus ; Günther, Oliver
Author_Institution :
Inst. of Inf. Syst., Humboldt Univ. Berlin, Germany
Abstract :
Interval inference is a specific kind of statistical disclosure where a snooper collects and analyzes publicly available data to determine tight bounds on confidential numerical data. Institutions that disseminate public data include Census Bureaus and other independent organizations such as regional healthcare initiatives that provide chronic disease data that is collected from physicians, pharmacies and health maintenance organizations (HMOs). Such initiatives must ensure that the confidential values of the data providers are protected against interval inference while making sure that the released information is still useful for the prospective data users (such as medical researchers). In this paper, we consider the important case of 2-dimensional tables where the rows correspond to the data providers and the columns to confidential data categories. Although the inner cells of this table are confidential and should under no circumstances be published, marginal information about central tendency and dispersion can still be useful and worth publishing. It is the task of the data-disseminating institution to elicit these specific marginal data elements for publication such that no tight bounds on any inner table cell can be inferred. We present a new method that maximizes the usefulness of the disseminated information to the prospective data users while ensuring the confidentiality of the inner table cell values. We give a computational analysis and compare our methods to existing statistical disclosure methods.
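As a rough illustration of the interval inference described in the abstract, the sketch below shows how published marginals alone can bound a confidential cell. It is not the method of the paper. It relies on two standard results: Samuelson's inequality (every one of n values lies within sqrt(n - 1) population standard deviations of their mean) and the Fréchet bounds for a non-negative cell given its row, column and grand totals. The function names and the sample figures are hypothetical, chosen only for this example.

```python
import math


def samuelson_bounds(n, mean, pop_std):
    """Interval that must contain every one of n values, given only their
    mean and population standard deviation (Samuelson's inequality)."""
    if n < 2:
        raise ValueError("need at least two values")
    half_width = pop_std * math.sqrt(n - 1)
    return mean - half_width, mean + half_width


def frechet_bounds(row_total, col_total, grand_total):
    """Bounds on a non-negative inner cell, given only the totals of its
    row and column and the grand total of the table."""
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical column: 5 data providers, published mean 10 and std 2.
    print(samuelson_bounds(5, 10.0, 2.0))
    # Hypothetical cell: row total 30, column total 40, grand total 50.
    print(frechet_bounds(30, 40, 50))
```

The narrower such an interval is relative to the cell's plausible range, the more the published marginals disclose. That is the trade-off a data-disseminating institution has to weigh before release.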
Keywords :
data privacy; information dissemination; public information systems; statistical databases; 2D tables; computational analysis; confidential data categories; confidential numerical data; data analysis; data confidentiality; data providers; data users; data-disseminating institution; interval inference detection; interval inference limitation; marginal data elements; marginal information; public data dissemination; publicly available data; statistical disclosure methods; Conference management; Databases
Conference_Title :
Scientific and Statistical Database Management, 2004. Proceedings. 16th International Conference on
Print_ISBN :
0-7695-2146-0
DOI :
10.1109/SSDM.2004.1311244