Analysis of a very large web search engine query log
ACM SIGIR Forum
Characteristics of question format web queries: an exploratory study
Information Processing and Management: an International Journal
Information Processing and Management: an International Journal
k-anonymity: a model for protecting privacy
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
ACM SIGIR Forum
A high-level programming environment for packet trace anonymization and transformation
Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications
Understanding user goals in web search
Proceedings of the 13th international conference on World Wide Web
On the complexity of optimal K-anonymity
PODS '04 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
On anonymizing query logs via token-based hashing
Proceedings of the 16th international conference on World Wide Web
Design and Semantics of a Decentralized Authorization Language
CSF '07 Proceedings of the 20th IEEE Computer Security Foundations Symposium
A survey of query log privacy-enhancing techniques from a policy perspective
ACM Transactions on the Web (TWEB)
Privacy integrated queries: an extensible platform for privacy-preserving data analysis
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
ICALP'06 Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part II
Hi-index | 0.00 |
Web search logs are of growing importance to researchers as they help understanding search behavior and search engine performance. However, search logs typically contain sensitive information about users and therefore considerable caution must be exercised when considering releasing the logs to the research community. Current approaches to releasing search logs focus on either protecting the privacy of users or enhancing the utility of data to researchers. In this work, we address the privacy-utility tradeoff by providing safe access to search logs, instead of releasing them. We propose a policy based safe interactive framework built on semantic policies and differential privacy to allow researchers access to search logs, while maintaining the privacy of the users. Semantic policies are used to infer the higher levels of information that can be mined from a dataset based on the fields accessed by a researcher. The accessed fields are then used to build research profile(s) that guide the amount of privacy to be enforced using differential privacy. We show the additional utility that can be obtained in our framework by two demonstrative experiments that involve access to user level information. Our results indicate that valid research can be conducted in our framework without forgoing the privacy of individuals.