Record linkage techniques have been widely used in areas such as antiterrorism, crime analysis, epidemiologic research, and database marketing. On the other hand, such techniques are also being increasingly used for identity matching that leads to the disclosure of private information. These techniques can be used to effectively reidentify records even in deidentified data. Consequently, the use of such techniques can lead to individual privacy being severely eroded. Our study addresses this important issue and provides a solution to resolve the conflict between privacy protection and data utility. We propose a data-masking method for protecting private information against record linkage disclosure that preserves the statistical properties of the data for legitimate analysis. Our method recursively partitions a data set into smaller subsets such that data records within each subset are more homogeneous after each partition. The partition is made orthogonal to the maximum variance dimension represented by the first principal component in each partitioned set. The attribute values of a record in a subset are then masked using a double-bounded swapping method. The proposed method, which we call multivariate swapping trees, is nonparametric in nature and does not require any assumptions about statistical distributions of the original data. Experiments conducted on real-world data sets demonstrate that the proposed approach significantly outperforms existing methods in terms of both preventing identity disclosure and preserving data quality.
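The partition-then-swap idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the split point, the stopping rule, or the details of the double-bounded swapping method, so the code below makes three labeled assumptions. It splits at the median projection onto the first principal component, stops when a subset is too small to split into two subsets of at least `min_leaf_size` records, and substitutes a plain random permutation of each attribute within a leaf for double-bounded swapping. The names `first_principal_component`, `partition`, `mask_by_swapping`, and `min_leaf_size` are hypothetical.

```python
import numpy as np


def first_principal_component(X):
    """Unit vector along the maximum-variance direction of X (rows = records)."""
    centered = X - X.mean(axis=0)
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]


def partition(X, indices, min_leaf_size, leaves):
    """Recursively split records with a hyperplane orthogonal to the first PC."""
    # Assumed stopping rule: stop when two leaves of min_leaf_size cannot be formed.
    if len(indices) < 2 * min_leaf_size:
        leaves.append(indices)
        return
    subset = X[indices]
    scores = (subset - subset.mean(axis=0)) @ first_principal_component(subset)
    order = np.argsort(scores, kind="stable")
    half = len(indices) // 2  # assumed split point: median projection score
    partition(X, indices[order[:half]], min_leaf_size, leaves)
    partition(X, indices[order[half:]], min_leaf_size, leaves)


def mask_by_swapping(X, min_leaf_size=5, seed=0):
    """Mask X by swapping attribute values among records in the same leaf.

    Stand-in for double-bounded swapping: each attribute is randomly
    permuted, independently of the others, within every leaf.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    leaves = []
    partition(X, np.arange(len(X)), min_leaf_size, leaves)
    masked = X.copy()
    for leaf in leaves:
        for j in range(X.shape[1]):
            masked[leaf, j] = X[rng.permutation(leaf), j]
    return masked, leaves


if __name__ == "__main__":
    data = np.random.default_rng(42).normal(size=(200, 4))
    masked, leaves = mask_by_swapping(data, min_leaf_size=10)
    # Swapping only moves values between records, so column means are unchanged.
    print(np.allclose(masked.mean(axis=0), data.mean(axis=0)))
    print(len(leaves), "leaves")
```

Because values are only exchanged among records in the same leaf, each attribute keeps its exact marginal distribution, and the masked values stay close to the originals to the extent that the leaves are homogeneous. How well relationships between attributes survive depends on the swapping rule, which this sketch simplifies.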