Reconstructing Data Perturbed by Random Projections When the Mixing Matrix Is Known

Authors:
Yingpeng Sang;Hong Shen;Hui Tian
Affiliations:
School of Computer Science, The University of Adelaide, Australia 5005;School of Computer Science, The University of Adelaide, Australia 5005;School of Mathematical Science, The University of Adelaide, Australia 5005
Venue:
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Year:
2009

Citing 27
Cited 1

A data distortion by probability distribution

ACM Transactions on Database Systems (TODS)
Security-control methods for statistical databases: a comparative study

ACM Computing Surveys (CSUR)
Privacy-preserving data mining

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Independent component analysis: algorithms and applications

Neural Networks
Atomic Decomposition by Basis Pursuit

SIAM Review
Using unknowns to prevent discovery of association rules

ACM SIGMOD Record
An Analytic Approach to Statistical Databases

VLDB '83 Proceedings of the 9th International Conference on Very Large Data Bases
Privacy Preserving Data Mining

CRYPTO '00 Proceedings of the 20th Annual International Cryptology Conference on Advances in Cryptology
Limiting privacy breaches in privacy preserving data mining

Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
k-anonymity: a model for protecting privacy

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Disclosure Limitation of Sensitive Rules

KDEX '99 Proceedings of the 1999 Workshop on Knowledge and Data Engineering Exchange
On the Privacy Preserving Properties of Random Data Perturbation Techniques

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Association Rule Hiding

IEEE Transactions on Knowledge and Data Engineering
Foundations of Cryptography: Volume 2, Basic Applications

Foundations of Cryptography: Volume 2, Basic Applications
Privacy-Preserving Distributed Mining of Association Rules on Horizontally Partitioned Data

IEEE Transactions on Knowledge and Data Engineering
A Framework for High-Accuracy Privacy-Preserving Mining

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Deriving private information from randomized data

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Random Projection-Based Multiplicative Data Perturbation for Privacy Preserving Distributed Data Mining

IEEE Transactions on Knowledge and Data Engineering
Blind Source Separation by Sparse Decomposition in a Signal Dictionary

Neural Computation
ℓ-Diversity: Privacy Beyond k-Anonymity

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Maintaining data privacy in association rule mining

VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Privacy-Preserving Data Mining: Models and Algorithms

Privacy-Preserving Data Mining: Models and Algorithms
Disclosure Risks of Distance Preserving Data Transformations

SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Deriving private information from arbitrarily projected data

PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
An attacker's view of distance preserving maps for privacy preserving data mining

PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Privacy preserving clustering

ESORICS'05 Proceedings of the 10th European conference on Research in Computer Security
General approach to blind source separation

IEEE Transactions on Signal Processing

On the use of decentralization to enable privacy in web-scale recommendation services

Proceedings of the 12th ACM workshop on Workshop on privacy in the electronic society

Abstract

Random Projection ($\mathcal{RP}$) has drawn great interest in privacy-preserving data mining research because of its high efficiency and security. It was proposed in [27], where the original data set, composed of $m$ attributes, is multiplied by a mixing matrix of dimensions $k \times m$ ($m > k$) that is random and orthogonal in expectation, and the $k$ series of perturbed data are then released for mining purposes. To our knowledge, little work has been done from the attacker's point of view: reconstructing the original data to obtain sensitive information, given the data perturbed by $\mathcal{RP}$ and some prior knowledge, e.g. the mixing matrix and the means and variances of the original data. When the attributes of the original data are mutually independent and sparse, the reconstruction can be treated as a problem of Underdetermined Independent Component Analysis (UICA), but UICA suffers from permutation and scaling ambiguities. In this paper we propose a reconstruction framework based on UICA, together with techniques to reduce these ambiguities. Cases in which the attributes of the original data are correlated and not sparse are also common in data mining. For the typical such case, the Multivariate Gaussian Distribution, we propose a reconstruction method based on Maximum A Posteriori (MAP) estimation. Our experiments show that our reconstructions achieve high recovery rates and outperform reconstructions based on Principal Component Analysis (PCA).
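
As an illustration of the setting described in the abstract, the following minimal sketch perturbs one record with a random $k \times m$ matrix and then reconstructs it under a multivariate Gaussian prior when the mixing matrix is known. It is not taken from the paper: the estimator used is the standard Gaussian conditional mean, $\hat{x} = \mu + \Sigma R^{T}(R \Sigma R^{T})^{-1}(y - R\mu)$, which coincides with the MAP estimate for noise-free $y = Rx$, and the paper's own method may differ in detail. The function names, the matrix scaling (i.i.d. entries with variance $1/k$, so that $E[R^{T}R] = I$), and the example values of $\mu$, $\Sigma$, and $x$ are all assumptions made for illustration. Plain Python is used to keep the sketch dependency-free.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def random_projection_matrix(k, m, rng):
    """k x m matrix with i.i.d. N(0, 1/k) entries, so E[R^T R] = I (assumed scaling)."""
    s = (1.0 / k) ** 0.5
    return [[rng.gauss(0.0, s) for _ in range(m)] for _ in range(k)]

def map_reconstruct(R, y, mu, Sigma):
    """Gaussian MAP estimate of x from noise-free y = R x with known R, mu, Sigma."""
    SRt = matmul(Sigma, transpose(R))          # m x k
    G = matmul(R, SRt)                         # k x k, equals R Sigma R^T
    Rmu = matmul(R, [[v] for v in mu])         # k x 1
    resid = [[yi - rm[0]] for yi, rm in zip(y, Rmu)]
    w = solve(G, resid)                        # (R Sigma R^T)^{-1} (y - R mu)
    corr = matmul(SRt, w)                      # m x 1
    return [mi + c[0] for mi, c in zip(mu, corr)]

rng = random.Random(0)
m, k = 4, 2                                    # m attributes, k < m projections
mu = [1.0, 2.0, 3.0, 4.0]                      # hypothetical known means
# Hypothetical covariance: unit variances, pairwise correlation 0.8.
Sigma = [[1.0 if i == j else 0.8 for j in range(m)] for i in range(m)]
R = random_projection_matrix(k, m, rng)        # mixing matrix known to the attacker
x = [1.5, 2.4, 3.6, 4.5]                       # hypothetical private record
y = [row[0] for row in matmul(R, [[v] for v in x])]   # released perturbed data
x_hat = map_reconstruct(R, y, mu, Sigma)
print("original:     ", x)
print("reconstructed:", [round(v, 3) for v in x_hat])
```

Because $k < m$, the system $y = Rx$ has infinitely many solutions; the prior selects the one that is most probable under the assumed Gaussian. The estimate always reproduces the released data exactly ($R\hat{x} = y$), and how close it comes to the true record depends on how strongly the attributes are correlated.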