Differential privacy (DP) is a promising scheme for releasing the results of statistical queries on sensitive data, with strong privacy guarantees against adversaries with arbitrary background knowledge. Existing studies on DP mostly focus on simple aggregations such as counts. This paper investigates the publication of DP-compliant histograms, an important analytical tool for showing the distribution of a random variable, e.g., hospital bill size for certain patients. Compared to simple aggregations, whose results are purely numerical, a histogram query is inherently more complex, since it must also determine its structure, i.e., the ranges of the bins. As we demonstrate in the paper, a DP-compliant histogram with finer bins may actually lead to significantly lower accuracy than a coarser one, since the former requires stronger perturbations in order to satisfy DP. Moreover, the histogram structure itself may reveal sensitive information, which further complicates the problem. Motivated by this, we propose two novel algorithms, namely Noise First and Structure First, for computing DP-compliant histograms. Their main difference lies in the relative order of the noise injection and the histogram structure computation steps. Noise First has the additional benefit that it can improve the accuracy of an already published DP-compliant histogram computed using a naive method. Going one step further, we extend both solutions to answer arbitrary range queries. Extensive experiments, using several real data sets, confirm that the proposed methods output highly accurate query answers and consistently outperform existing competitors.
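The trade-off between bin granularity and noise can be illustrated with a minimal Python sketch. It is not the paper's Noise First or Structure First algorithm: the abstract does not say how either one chooses the bin structure, so this sketch simply assumes a fixed merge width. It also assumes the standard Laplace mechanism with sensitivity 1 per bin count (one record added or removed). All function names and parameter values are hypothetical and chosen for illustration only. Noise is added to every fine-grained bin first, and adjacent noisy bins are then averaged, which is pure post-processing of an already noisy release.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng):
    # Assumption: each record falls in exactly one bin, so the
    # sensitivity of the count vector is 1 and the noise scale is 1/epsilon.
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def merge_bins(values, width):
    # Replace every run of `width` adjacent bins by the run's average.
    # Fixed-width grouping is an illustrative assumption, not the
    # structure search used by the paper's algorithms.
    merged = []
    for start in range(0, len(values), width):
        group = values[start:start + width]
        mean = sum(group) / len(group)
        merged.extend([mean] * len(group))
    return merged


def mean_squared_error(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)


rng = random.Random(0)            # fixed seed so the run is repeatable
true_counts = [100] * 1024        # hypothetical, perfectly flat distribution
fine = noisy_histogram(true_counts, epsilon=0.1, rng=rng)
coarse = merge_bins(fine, width=8)

mse_fine = mean_squared_error(fine, true_counts)
mse_coarse = mean_squared_error(coarse, true_counts)
print(f"fine-bin MSE:   {mse_fine:.1f}")
print(f"merged-bin MSE: {mse_coarse:.1f}")
```

On this flat input, averaging reduces the noise variance by roughly the merge width without adding any approximation error, so the coarser histogram is the more accurate one. On skewed data, merging would also blur real differences between neighboring bins, which is why the choice of structure matters.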