Web search query privacy: Evaluating query obfuscation and anonymizing networks

Authors:
Sai Teja Peddinti; Nitesh Saxena
Affiliations:
Polytechnic School of Engineering, New York University, New York, USA. E-mail: psaiteja@nyu.edu; University of Alabama, Birmingham, USA. E-mail: saxena@cis.uab.edu
Venue:
Journal of Computer Security
Year:
2014

Abstract

Web search is one of the most rapidly growing applications on the Internet today. However, the current practice followed by most search engines, of logging and analyzing users' queries, raises serious privacy concerns. In this paper, we concentrate on two existing solutions that are relatively easy to deploy, namely query obfuscation and anonymizing networks. In query obfuscation, client-side software attempts to mask real user queries by injecting noisy queries. Anonymizing networks route user queries through a series of relay servers, hiding the actual query source from the search engine. A fundamental problem with these solutions, however, is that user queries are still revealed to the search engine, although they are "mixed" among queries generated either by a machine or by other users. We focus on TrackMeNot (TMN), a popular query obfuscation tool, and the Tor anonymizing network, and analyze whether these solutions can actually preserve users' privacy in practice against an adversarial search engine. We demonstrate that a search engine equipped with only a short-term history of a user's search queries can break the privacy guarantees of TMN and Tor using only off-the-shelf machine learning techniques.
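
To make the threat model concrete, the following is a minimal sketch of how a search engine might separate a user's real queries from injected noise once it holds a short labeled history. It is an illustration only: the abstract does not specify the features or classifiers used in the paper, so the multinomial Naive Bayes model, the whitespace tokenizer, the function names, and the toy queries below are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(query):
    # Assumption: simple lowercase whitespace tokenization.
    return query.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training queries
        self.vocab = set()

    def fit(self, queries, labels):
        for query, label in zip(queries, labels):
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for tok in tokenize(query):
                counts[tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, query):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokenize(query):
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def recover_user_queries(model, stream):
    # Keep only the queries the model attributes to the real user.
    return [q for q in stream if model.predict(q) == "user"]


# Hypothetical short-term history the adversary has already labeled.
history = [
    ("birmingham weather forecast", "user"),
    ("alabama football schedule", "user"),
    ("cheap flights birmingham", "user"),
    ("uab library hours", "user"),
    ("celebrity news headlines", "noise"),
    ("stock market headlines today", "noise"),
    ("breaking news world", "noise"),
    ("top movie trailers", "noise"),
]
model = NaiveBayes().fit([q for q, _ in history], [y for _, y in history])

# Hypothetical obfuscated stream mixing one real query with one noisy query.
stream = ["birmingham football tickets", "world news today"]
recovered = recover_user_queries(model, stream)
print(recovered)
```

On this toy data the classifier attributes only the first query in the stream to the user. Real query logs are far noisier, so the sketch shows the shape of the attack rather than its measured accuracy.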