Valid Statistical Analysis for Logistic Regression with Multiple Sources

  • Authors:
  • Stephen E. Fienberg;Yuval Nardi;Aleksandra B. Slavković

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh 15213;Carnegie Mellon University, Pittsburgh 15213;Pennsylvania State University 16802

  • Venue:
  • Protecting Persons While Protecting the People
  • Year:
  • 2009

Abstract

Considerable effort has gone into understanding issues of privacy protection of individual information in single databases, and various solutions have been proposed depending on the nature of the data, the ways in which the database will be used, and the precise nature of the privacy protection being offered. Once data are merged across sources, however, the problem becomes far more complex, and a number of privacy issues arise for the linked individual files that go well beyond those considered for the data within individual sources. In this paper, we propose an approach that provides a full statistical analysis of the combined database without actually combining it. We focus mainly on logistic regression, but the methods and tools described apply, in essence, to other statistical models as well.
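
To make the idea of analyzing a combined database "without actually combining it" concrete, here is a minimal sketch under stated assumptions, not the authors' protocol. It assumes the data are horizontally partitioned (each source holds different individuals but the same variables) and that the sources can share additive summaries. Each source computes the gradient and Hessian of the logistic log-likelihood on its own rows, and only the sums of these summaries enter a Newton-Raphson update. The function names (`local_summaries`, `fit_distributed`, `make_parties`) and the simulated data are hypothetical, and the plain Python `sum` stands in for whatever secure summation step a real deployment would use.

```python
import numpy as np


def local_summaries(X, y, beta):
    """Gradient and Hessian of the logistic log-likelihood on one source's rows."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = -(X.T * (p * (1.0 - p))) @ X
    return grad, hess


def fit_distributed(parties, n_iter=25):
    """Newton-Raphson using only per-source summaries, never the raw rows."""
    beta = np.zeros(parties[0][0].shape[1])
    for _ in range(n_iter):
        summaries = [local_summaries(X, y, beta) for X, y in parties]
        # Assumption: a secure-summation protocol would replace these plain sums.
        grad = sum(g for g, _ in summaries)
        hess = sum(h for _, h in summaries)
        beta = beta - np.linalg.solve(hess, grad)
    return beta


def make_parties(seed=0):
    """Simulate three sources sharing the same variables (illustrative data only)."""
    rng = np.random.default_rng(seed)
    true_beta = np.array([-0.5, 1.0, -1.0])
    parties = []
    for n in (200, 300, 250):
        X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
        prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
        y = (rng.random(n) < prob).astype(float)
        parties.append((X, y))
    return parties
```

Because the log-likelihood is a sum over individuals, the summed gradient and Hessian are the same as those of the pooled data, so under these assumptions the estimate matches the one obtained by stacking all rows into a single table. Vertically partitioned data (same individuals, different variables at each source) do not decompose this simply and are not covered by this sketch.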