Web search is one of the most rapidly growing applications on the Internet today. However, the current practice followed by most search engines, namely logging and analyzing users' queries, raises serious privacy concerns. In this paper, we concentrate on two existing solutions that are relatively easy to deploy: query obfuscation and anonymizing networks. In query obfuscation, client-side software attempts to mask real user queries by injecting noise queries. Anonymizing networks route user queries through a series of relay servers, hiding the actual query source from the search engine. A fundamental problem with these solutions, however, is that user queries are still revealed to the search engine, although they are "mixed" among queries generated either by a machine or by other users. We focus on TrackMeNot (TMN), a popular query obfuscation tool, and the Tor anonymizing network, and analyze whether these solutions can actually preserve users' privacy in practice against an adversarial search engine. We demonstrate that a search engine equipped with only a short-term history of a user's search queries can break the privacy guarantees of TMN and Tor using only off-the-shelf machine learning techniques.
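To make the threat model concrete, the following is a minimal sketch of how a short query history could be used to separate a user's real queries from injected noise. It is not the classifier or feature set used in the paper: the multinomial naive Bayes model, the `train` and `classify` helpers, and the tiny `history` data set are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(query):
    """Split a query into lowercase words."""
    return query.lower().split()


def train(labeled_queries):
    """Fit per-label word counts from (query, label) pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training queries
    vocab = set()
    for query, label in labeled_queries:
        label_counts[label] += 1
        counter = word_counts.setdefault(label, Counter())
        for word in tokenize(query):
            counter[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(model, query):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in tokenize(query):
            # Laplace smoothing so unseen words do not zero out the score.
            smoothed = (word_counts[label][word] + 1) / (n_words + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical short-term history: queries the engine already attributes
# to the user, plus queries it believes were machine-generated noise.
history = [
    ("cheap flights to boston", "user"),
    ("boston hotel deals", "user"),
    ("boston marathon route", "user"),
    ("breaking news headlines", "noise"),
    ("stock market news today", "noise"),
    ("celebrity news update", "noise"),
]

model = train(history)
print(classify(model, "boston flights deals"))
print(classify(model, "market news"))
```

A real adversary would have far more data and richer features, but the sketch shows why mixing alone may not be enough: if a user's real queries are topically consistent, even a simple model trained on a short history can tell them apart from unrelated noise.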