A Hybrid Approach to Private Record Linkage

  • Authors:
  • Ali Inan;Murat Kantarcioglu;Elisa Bertino;Monica Scannapieco

  • Affiliations:
  • Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75083, USA. inan@student.utdallas.edu;Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75083, USA. muratk@utdallas.edu;Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, USA. bertino@cs.purdue.edu;Dipartimento di Informatica e Sistemistica, Universita di Roma "La Sapienza", Roma 00198, Italy. monscan@dis.uniroma1.it

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

Real-world entities are not always represented by the same set of features in different data sets. Therefore, matching and linking records that correspond to the same real-world entity across these data sets is a challenging task. If the data sets contain private information, the problem becomes even harder due to privacy concerns. Existing solutions to this problem mostly follow one of two approaches: sanitization techniques and cryptographic techniques. The former achieve privacy by perturbing sensitive data, at the expense of degraded matching accuracy. The latter, on the other hand, attain both privacy and high accuracy, but at heavy communication and computation costs. In this paper, we propose a method that combines these two approaches and enables users to trade off between privacy, accuracy and cost. Experiments conducted on real data sets show that our method has significantly lower costs than cryptographic techniques and yields much more accurate matching results than sanitization techniques, even when the data sets are perturbed extensively.
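
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify; it is a toy illustration under stated assumptions. It assumes a single numeric attribute, a sanitization step that replaces each value by a fixed-width bucket, and a distance threshold for declaring a match. All names (`Interval`, `generalize`, `hybrid_link`, `width`, `threshold`) are hypothetical, and the plain `abs(x - y)` comparison is only a placeholder for whatever cryptographic protocol would compare the exact values securely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open bucket [lo, hi) that stands in for an exact value."""
    lo: float
    hi: float


def generalize(value: float, width: float) -> Interval:
    # Sanitization stand-in (assumption): publish only the bucket of the value.
    start = (value // width) * width
    return Interval(start, start + width)


def min_max_distance(a: Interval, b: Interval) -> tuple[float, float]:
    # Bounds on |x - y| for any x in a and y in b.
    min_d = max(0.0, a.lo - b.hi, b.lo - a.hi)
    max_d = max(a.hi - b.lo, b.hi - a.lo)
    return min_d, max_d


def hybrid_link(xs, ys, width, threshold):
    """Return (matched index pairs, number of costly exact comparisons)."""
    matches, costly = [], 0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            min_d, max_d = min_max_distance(
                generalize(x, width), generalize(y, width)
            )
            if min_d > threshold:
                continue  # buckets alone prove a non-match
            if max_d <= threshold:
                matches.append((i, j))  # buckets alone prove a match
                continue
            # Undecided pair: fall back to an exact comparison.
            # Placeholder (assumption) for a secure cryptographic protocol.
            costly += 1
            if abs(x - y) <= threshold:
                matches.append((i, j))
    return matches, costly


if __name__ == "__main__":
    xs = [21.0, 35.0, 62.0]
    ys = [22.0, 48.0, 90.0]
    for width in (10.0, 1.0):
        print(width, hybrid_link(xs, ys, width, threshold=5.0))
```

In this toy setting, wider buckets reveal less about each value but leave more pairs undecided, so more of them need the costly exact comparison. Narrower buckets decide more pairs cheaply while disclosing more. The matching result itself stays exact in both cases, because every undecided pair is resolved by the fallback comparison.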