Secure computation with horizontally partitioned data using adaptive regression splines

  • Authors: Joyee Ghosh; Jerome P. Reiter; Alan F. Karr
  • Affiliations: Duke University, Durham, NC, USA; Duke University, Durham, NC, USA; National Institute of Statistical Sciences, Research Triangle Park, NC, USA
  • Venue: Computational Statistics & Data Analysis
  • Year: 2007

Abstract

When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates for some statistical models, including linear regressions, without sharing individual records. A drawback of these techniques is that the model must be specified before the protocol is initiated, and the usual exploratory strategies for finding well-fitting models are of limited use because the individual records are not shared. In this paper, we present a protocol for secure adaptive regression splines that allows for flexible, semi-automatic regression modeling. This flexibility reduces the risk of model misspecification inherent in secure computation settings. We illustrate the protocol with air pollution data.
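
To make the idea of secure linear regression on horizontally partitioned data concrete, here is a minimal sketch. It is not the paper's adaptive regression splines protocol; it illustrates only the general background technique the abstract mentions, using a textbook ring-based secure summation of local sufficient statistics. Every name here (`secure_sum`, `local_statistics`, `pooled_least_squares`, `MODULUS`) is hypothetical, the message passing between owners is simulated inside one process, and the data are assumed to be non-negative integers small enough that every pooled total stays below `MODULUS`.

```python
import secrets
from fractions import Fraction

# Assumption: every true pooled total is a non-negative integer below MODULUS.
MODULUS = 2**64


def secure_sum(local_values, modulus=MODULUS):
    """Simulate ring-based secure summation.

    The first owner hides its value behind a random mask, each later owner
    adds its own value to the running total it receives, and the first owner
    removes the mask at the end. No owner sees another owner's raw value.
    """
    mask = secrets.randbelow(modulus)
    running = (mask + local_values[0]) % modulus
    for value in local_values[1:]:
        running = (running + value) % modulus
    return (running - mask) % modulus


def local_statistics(records):
    """Sufficient statistics for y = a + b*x from one owner's (x, y) records."""
    n = len(records)
    sum_x = sum(x for x, _ in records)
    sum_xx = sum(x * x for x, _ in records)
    sum_y = sum(y for _, y in records)
    sum_xy = sum(x * y for x, y in records)
    return [n, sum_x, sum_xx, sum_y, sum_xy]


def pooled_least_squares(owners):
    """Fit the pooled regression using only securely summed statistics."""
    stats = [local_statistics(records) for records in owners]
    n, sum_x, sum_xx, sum_y, sum_xy = (
        secure_sum([s[k] for s in stats]) for k in range(5)
    )
    # Closed-form solution of the 2x2 normal equations, kept exact.
    slope = Fraction(n * sum_xy - sum_x * sum_y, n * sum_xx - sum_x * sum_x)
    intercept = Fraction(sum_y, n) - slope * Fraction(sum_x, n)
    return intercept, slope


# Three owners hold different records on the same variables (x, y).
owners = [[(0, 1), (1, 3)], [(2, 5)], [(3, 7)]]
intercept, slope = pooled_least_squares(owners)
print(intercept, slope)  # prints "1 2", since the pooled data lie on y = 1 + 2x
```

The sketch also shows the drawback the abstract describes: the model form (here a straight line in `x`) is fixed before any statistics are exchanged, so the owners cannot inspect residuals or individual records to revise it.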