Releasing Individually Identifiable Microdata with Privacy Protection Against Stochastic Threat: An Application to Health Information

  • Authors:
  • Robert Garfinkel; Ram Gopal; Steven Thompson

  • Affiliations:
  • Department of Operations and Information Management, School of Business, University of Connecticut, Storrs, Connecticut 06029 (all authors)

  • Venue:
  • Information Systems Research
  • Year:
  • 2007

Abstract

The ability to collect and disseminate individually identifiable microdata is becoming increasingly important in a number of arenas. This is especially true in health care and national security, where this data is considered vital for a number of public health and safety initiatives. In some cases legislation has been used to establish some standards for limiting the collection of and access to such data. However, all such legislative efforts contain many provisions that allow for access to individually identifiable microdata without the consent of the data subject. Furthermore, although legislation is useful in that penalties are levied for violating the law, these penalties occur after an individual's privacy has been compromised. Such deterrent measures can only serve as disincentives and offer no true protection. This paper considers security issues involved in releasing microdata, including individual identifiers. The threats to the confidentiality of the data subjects come from the users possessing statistical information that relates the revealed microdata to suppressed confidential information. The general strategy is to recode the initial data, in which some subjects are “safe” and some are at risk, into a data set in which no subjects are at risk. We develop a technique that enables the release of individually identifiable microdata in a manner that maximizes the utility of the released data while providing preventive protection of confidential data. Extensive computational results show that the proposed method is practical and viable and that useful data can be released even when the level of risk in the data is high.