We describe CrowdLogging, an approach for distributed search log collection, storage, and mining, with the dual goals of preserving privacy and making the mined information broadly available. Most search log mining approaches and most privacy-enhancing schemes have focused on centralized search logs and methods for disseminating them to third parties. In our approach, a user's search log is encrypted and shared in such a way that (a) the source of a search behavior artifact, such as a query, is unknown and (b) extremely rare artifacts (that is, artifacts more likely to contain private information) are not revealed. The approach works with any search behavior artifact that can be extracted from a search log, including queries, query reformulations, and query-click pairs. In this work, we: (1) present a distributed search log collection, storage, and mining framework; (2) compare several privacy policies, including differential privacy, showing the trade-offs between strong guarantees and the utility of the released data; (3) demonstrate the impact of our approach using two existing research query logs; and (4) describe a pilot study for which we implemented a version of the framework.
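To make property (b) concrete, the following is a minimal plaintext sketch of a rarity threshold: an artifact is released only if at least k distinct users contributed it. The function name `release_frequent_artifacts`, the threshold parameter `k`, and the sample logs are assumptions made for illustration; the abstract does not specify the threshold rule or the encryption scheme, and this sketch omits the encryption and source-hiding steps entirely.

```python
from collections import defaultdict


def release_frequent_artifacts(user_logs, k):
    """Return {artifact: distinct_user_count} for artifacts shared by >= k users.

    user_logs maps a user id to the artifacts (e.g. queries) extracted from
    that user's search log. The threshold k is an illustrative assumption.
    """
    users_per_artifact = defaultdict(set)
    for user_id, artifacts in user_logs.items():
        for artifact in artifacts:
            # Count each user at most once per artifact.
            users_per_artifact[artifact].add(user_id)
    return {
        artifact: len(users)
        for artifact, users in users_per_artifact.items()
        if len(users) >= k
    }


# Hypothetical logs: one query is unique to a single user and is withheld.
logs = {
    "u1": ["weather boston", "jane doe home address"],
    "u2": ["weather boston", "red sox score"],
    "u3": ["weather boston", "red sox score"],
}
released = release_frequent_artifacts(logs, k=2)
print(released)
```

Counting distinct users rather than raw occurrences keeps one user from pushing a private query over the threshold by repeating it. The released output carries only artifacts and counts, not user ids.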