On Security of Statistical Databases

  • Authors:
  • R. Ahlswede;H. Aydinian

  • Affiliations:
  • -;ayd@math.uni-bielefeld.de

  • Venue:
  • SIAM Journal on Discrete Mathematics
  • Year:
  • 2011

Abstract

A statistical database (SDB) is a database that answers user queries with statistical information derived from its records, for the purpose of statistical data analysis. Sometimes, by correlating enough statistics, confidential data about an individual stored in an SDB can be inferred. Examples of such confidential information are salaries or data concerning the medical history of individuals. An important problem is to secure SDBs against the disclosure of confidential information. An SDB is said to be secure if no protected data can be inferred from the available queries. One of the security-control methods suggested in the literature is query restriction: the use of the SDB is limited by a control mechanism so that no protected data can be obtained from the available queries. Chin and Ozsoyoglu [IEEE Trans. Software Engrg., 8 (1982), pp. 574-582] introduced a control mechanism, called AUDIT EXPERT, in which only SUM queries, that is, only certain sums of individual records, are available to the users. This SUM query model leads to several challenging optimization problems. Assume there are $n$ numeric records $\{z_1,\ldots,z_n\}$ stored in a database. A natural problem is to maximize the number of answerable SUM queries, that is, the number of subset sums of $\{z_1,\ldots,z_n\}$ (possibly with some additional constraints) that can be returned such that none of the numbers $z_i$ (or sums of subsets of size not exceeding a specified number) can be inferred from these queries. In this paper we give tight bounds for this number under constraints on the size and dimension of query subsets.
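
To make the notion of inference in the SUM query model concrete, the following is a minimal sketch, not the AUDIT EXPERT implementation or the method of this paper. It assumes the usual linear-algebra reading of the model: each SUM query is a 0/1 vector over the $n$ records, and a record $z_i$ can be inferred from the answers exactly when the unit vector $e_i$ lies in the row space of the query matrix. The function names `row_reduce` and `disclosed_records` are hypothetical, and records are indexed from 0 for convenience.

```python
from fractions import Fraction


def row_reduce(rows):
    """Return the nonzero rows of the reduced row echelon form (exact arithmetic)."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]


def disclosed_records(queries, n):
    """Indices i such that z_i is determined by the answers to the SUM queries.

    Assumption (hypothetical helper): each query is a set of record indices
    in range(n), and its answer is the sum of the records it names.
    """
    rows = [[1 if i in q else 0 for i in range(n)] for q in queries]
    rref = row_reduce(rows)
    # A unit vector e_i lies in the row space iff it appears as a row of the RREF.
    return sorted(
        row.index(1) for row in rref if sum(1 for x in row if x != 0) == 1
    )


if __name__ == "__main__":
    print(disclosed_records([{0, 1}, {1, 2}], 3))
    print(disclosed_records([{0, 1, 2}, {0, 1}], 3))
```

In this sketch, a set of queries would be considered secure (with respect to single records) when `disclosed_records` returns an empty list. The optimization problem described in the abstract asks how large such a secure query set can be; the sketch only checks a given set and says nothing about the bounds themselves.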