In this paper we study the privacy preservation properties of a specific technique for query log anonymization: token-based hashing. In this approach, each query is tokenized, and then a secure hash function is applied to each token. We show that statistical techniques may be applied to partially compromise the anonymization. We then analyze the specific risks that arise from these partial compromises, focusing on the revelation of identity from unambiguous names, addresses, and so forth, and on the revelation of facts associated with an identity that are deemed to be highly sensitive. Our goal in this work is twofold: to show that token-based hashing is unsuitable for anonymization, and to present a concrete analysis of specific techniques that may be effective in breaching privacy, against which other anonymization schemes should be measured.
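As a rough illustration of the scheme described above, the following minimal Python sketch tokenizes each query on whitespace and replaces every token with a salted SHA-256 digest. The salt value, the truncation to 12 hex characters, the whitespace tokenizer, and the toy log are all assumptions made for this example and are not taken from the paper. The sketch also shows one property a statistical attack could exploit: because hashing is deterministic, token frequencies survive anonymization unchanged. This is only an illustration of that property, not the paper's specific attack.

```python
import hashlib
from collections import Counter

# Hypothetical secret salt; the paper does not specify one.
SALT = b"hypothetical-secret-salt"


def hash_token(token: str, salt: bytes = SALT) -> str:
    """Map one token to a truncated hex digest (truncation is for readability only)."""
    return hashlib.sha256(salt + token.encode("utf-8")).hexdigest()[:12]


def anonymize_query(query: str) -> list:
    """Tokenize on whitespace (an assumed tokenizer) and hash each token."""
    return [hash_token(tok) for tok in query.lower().split()]


def anonymize_log(queries) -> list:
    """Apply token-based hashing to every query in a log."""
    return [anonymize_query(q) for q in queries]


# Toy query log, invented for illustration.
log = [
    "cheap flights paris",
    "paris hotels",
    "cheap hotels",
    "paris weather",
]
hashed_log = anonymize_log(log)

# Deterministic hashing preserves how often each token occurs.
plain_counts = Counter(tok for q in log for tok in q.split())
hashed_counts = Counter(h for q in hashed_log for h in q)

if __name__ == "__main__":
    for token, count in plain_counts.most_common():
        print(token, hash_token(token), count, hashed_counts[hash_token(token)])
```

Since the frequency profile of the hashed log matches that of the plaintext log, an adversary holding a comparable unhashed log could try to align frequent hashes with frequent words, which is the general kind of partial compromise the abstract refers to.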