Web search logs contain extremely sensitive data, as evidenced by the recent AOL incident. However, storing and analyzing search logs is useful for many purposes (e.g., investigating human behavior). An important research question is therefore how to sanitize search logs while preserving privacy. Several search log anonymization techniques have been proposed with concrete privacy models. However, in all of these solutions the utility of the output is only evaluated after the fact; it is never maximized. For effective search log anonymization, it is desirable to derive outputs with optimal utility while meeting the privacy standard. In this paper, we propose utility-maximizing sanitization of search logs based on the rigorous privacy standard of differential privacy. Specifically, we use optimization models to maximize the output utility of the sanitization for different applications, while ensuring that the production process satisfies differential privacy. An added benefit is that our novel randomization strategy maintains the schema integrity of the output search logs. A comprehensive evaluation on real search logs validates the approach and demonstrates its robustness and scalability.
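
For readers unfamiliar with the privacy standard named above, the textbook definition of ε-differential privacy is sketched below. The notation is an assumption introduced here and does not come from the abstract: $\mathcal{A}$ is the randomized sanitization mechanism, $D$ and $D'$ are neighboring search logs (differing in one user's records), and $S$ is any set of possible outputs. The paper itself may rely on a relaxed variant of this definition.

```latex
% Standard \varepsilon-differential privacy (textbook form; notation assumed here).
% For all neighboring inputs D, D' and every output set S \subseteq \mathrm{Range}(\mathcal{A}):
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{A}(D') \in S]
```

A smaller $\varepsilon$ means the output distribution changes less when one user's data is added or removed, which is the constraint under which the output utility would be maximized.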