The scientist and engineer's guide to digital signal processing
The scientist and engineer's guide to digital signal processing
Efficient clustering of high-dimensional data sets with application to reference matching
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem
Data Mining and Knowledge Discovery
Tools for privacy preserving distributed data mining
ACM SIGKDD Explorations Newsletter
k-anonymity: a model for protecting privacy
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Properties of Embedding Methods for Similarity Searching in Metric Spaces
IEEE Transactions on Pattern Analysis and Machine Intelligence
Efficient Record Linkage in Large Data Sets
DASFAA '03 Proceedings of the Eighth International Conference on Database Systems for Advanced Applications
Privacy-Preserving Cooperative Statistical Analysis
ACSAC '01 Proceedings of the 17th Annual Computer Security Applications Conference
Information sharing across private databases
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Secure and private sequence comparisons
Proceedings of the 2003 ACM workshop on Privacy in the electronic society
Blocking-aware private record linkage
Proceedings of the 2nd international workshop on Information quality in information systems
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Privacy Preserving Query Processing Using Third Parties
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Record linkage: similarity measures and algorithms
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Duplicate Record Detection: A Survey
IEEE Transactions on Knowledge and Data Engineering
Privacy preserving schema and data matching
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Data Quality Assessment
A Hybrid Approach to Private Record Linkage
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Efficient Private Record Linkage
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Privacy Preserving Record Linkage Using Phonetic Codes
BCI '09 Proceedings of the 2009 Fourth Balkan Conference in Informatics
Private record matching using differential privacy
Proceedings of the 13th International Conference on Extending Database Technology
Public-key cryptosystems based on composite degree residuosity classes
EUROCRYPT'99 Proceedings of the 17th international conference on Theory and application of cryptographic techniques
On private scalar product computation for privacy-preserving data mining
ICISC'04 Proceedings of the 7th international conference on Information Security and Cryptology
A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication
IEEE Transactions on Knowledge and Data Engineering
An iterative two-party protocol for scalable privacy-preserving record linkage
AusDM '12 Proceedings of the Tenth Australasian Data Mining Conference - Volume 134
Hi-index | 0.00 |
Record linkage is used to associate entities from multiple data sources. For example, two organizations contemplating a merger may want to know how common their customer bases are so that they may better assess the benefits of the merger. Another example is a database of people who are forbidden from a certain activity by regulators, may need to be compared to a list of people engaged in that activity. The autonomous entities who wish to carry out the record matching computation are often reluctant to fully share their data; they fear losing control over its subsequent dissemination and usage, or they want to insure privacy because the data is proprietary or confidential, and/or they are cautious simply because privacy laws forbid its disclosure or regulate the form of that disclosure. In such cases, the problem of carrying out the linkage computation without full data exchange has been called private record linkage. Previous private record linkage techniques have made use of a third party. We provide efficient techniques for private record linkage that improve on previous work in that (1) our techniques make no use of a third party, and (2) they achieve much better performance than previous schemes in terms of their execution time while maintaining acceptable quality of output compared to nonprivacy settings. Our protocol consists of two phases. The first phase primarily produces candidate record pairs for matching, by carrying out a very fast (but not accurate) matching between such pairs of records. The second phase is a novel protocol for efficiently computing distances between each candidate pair (without any expensive cryptographic operations such as modular exponentiations). Our experimental evaluation of our approach validates these claims.