A Hybrid Approach to Private Record Linkage

Authors:
Ali Inan;Murat Kantarcioglu;Elisa Bertino;Monica Scannapieco
Affiliations:
Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75083, USA. inan@student.utdallas.edu;Department of Computer Science, The University of Texas at Dallas, Richardson, TX 75083, USA. muratk@utdallas.edu;Department of Computer Sciences, Purdue University, West Lafayette, IN 47907, USA. bertino@cs.purdue.edu;Dipartimento di Informatica e Sistemistica, Universita di Roma "La Sapienza", Roma 00198, Italy. monscan@dis.uniroma1.it
Venue:
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Year:
2008

Abstract

Real-world entities are not always represented by the same set of features in different data sets. Therefore, matching and linking records that correspond to the same real-world entity across these data sets is a challenging task. If the data sets contain private information, the problem becomes even harder due to privacy concerns. Existing solutions to this problem mostly follow one of two approaches: sanitization techniques and cryptographic techniques. The former achieve privacy by perturbing sensitive data, at the expense of matching accuracy. The latter, on the other hand, attain both privacy and high accuracy, but at heavy communication and computation costs. In this paper, we propose a method that combines these two approaches and enables users to trade off between privacy, accuracy, and cost. Experiments conducted on real data sets show that our method has significantly lower costs than cryptographic techniques and yields much more accurate matching results than sanitization techniques, even when the data sets are perturbed extensively.
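
The abstract does not spell out the protocol, so the following is only an illustrative sketch of how a sanitization step and an expensive exact comparison might be combined. It assumes a single numeric attribute, fixed-width bucketing as the sanitization step, and a plain in-the-clear comparison standing in for a cryptographic protocol. All names (`generalize`, `distance_bounds`, `hybrid_link`) and the parameters `width` and `threshold` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Sanitized view of a numeric value: the true value lies in [lo, hi]."""
    lo: float
    hi: float


def generalize(value, width):
    """Coarsen a value to a bucket of the given width (assumed sanitization step)."""
    lo = (value // width) * width
    return Interval(lo, lo + width)


def distance_bounds(a, b):
    """Lower and upper bounds on |x - y| for any x in a and y in b."""
    min_d = max(0, a.lo - b.hi, b.lo - a.hi)
    max_d = max(a.hi - b.lo, b.hi - a.lo)
    return min_d, max_d


def hybrid_link(xs, ys, width, threshold):
    """Return (matches, costly), where costly counts the exact comparisons used.

    Pairs are decided from the sanitized intervals whenever the bounds allow it.
    Only the remaining uncertain pairs use the exact comparison, which here is a
    plain stand-in for a cryptographic protocol and offers no privacy itself.
    """
    gx = [generalize(x, width) for x in xs]
    gy = [generalize(y, width) for y in ys]
    matches, costly = [], 0
    for i, a in enumerate(gx):
        for j, b in enumerate(gy):
            min_d, max_d = distance_bounds(a, b)
            if min_d > threshold:
                continue  # certain non-match from sanitized data alone
            if max_d <= threshold:
                matches.append((i, j))  # certain match from sanitized data alone
                continue
            costly += 1  # undecided: fall back to the expensive exact comparison
            if abs(xs[i] - ys[j]) <= threshold:
                matches.append((i, j))
    return matches, costly


xs = [21, 37, 64]
ys = [23, 58, 90]
matches, costly = hybrid_link(xs, ys, width=10, threshold=5)
print(matches, costly)  # [(0, 0)] 3
```

In this toy setting, only 3 of the 9 candidate pairs need the exact comparison, and the result equals that of comparing every pair exactly. Widening the buckets hides more about each value but leaves more pairs undecided, which is one way to picture the trade-off between privacy, accuracy, and cost that the abstract refers to.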