Extracting user web browsing patterns from non-content network traces: The online advertising case study

Authors:
Gabriel Maciá-Fernández;Yong Wang;Rafael A. Rodríguez-Gómez;Aleksandar Kuzmanovic
Affiliations:
University of Granada, Dept. Signal Theory, Telematics and Communications, CITIC, Spain;University of Electronic Science and Technology of China, Chengdu, China;University of Granada, Dept. Signal Theory, Telematics and Communications, CITIC, Spain;Northwestern University, Evanston, Illinois, USA
Venue:
Computer Networks: The International Journal of Computer and Telecommunications Networking
Year:
2012


Abstract

Online advertising is a rapidly growing industry currently dominated by the search engine 'giant' Google. In an attempt to tap into this huge market, Internet Service Providers (ISPs) started deploying deep packet inspection techniques to track and collect user browsing behavior. However, these providers fear that such techniques violate wiretap laws, which explicitly prohibit intercepting the contents of communication without consumers' consent. In this paper, we explore how ISPs can extract user browsing patterns without inspecting the contents of communication. Our contributions are threefold. First, we develop a methodology and implement a system capable of extracting web browsing features from stored non-content network traces, which could be legally shared. When these browsing features are correlated with information collected by independently crawling the Web, it becomes possible to recover the actual web pages accessed by clients. Second, we evaluate our system on the Internet and verify that it recovers user browsing patterns with high accuracy.