Website fingerprinting: attacking popular privacy enhancing technologies with the multinomial naïve-bayes classifier

  • Authors:
  • Dominik Herrmann;Rolf Wendolsky;Hannes Federrath

  • Affiliations:
  • University of Regensburg, Regensburg, Germany;JonDos GmbH, Regensburg, Germany;University of Regensburg, Regensburg, Germany

  • Venue:
  • Proceedings of the 2009 ACM workshop on Cloud computing security
  • Year:
  • 2009

Abstract

Privacy enhancing technologies like OpenSSL, OpenVPN or Tor establish an encrypted tunnel that enables users to hide content and addresses of requested websites from external observers. This protection is endangered by local traffic analysis attacks that allow an external, passive attacker between the PET system and the user to uncover the identity of the requested sites. However, existing proposals for such attacks are not practicable yet. We present a novel method that applies common text mining techniques to the normalised frequency distribution of observable IP packet sizes. Our classifier correctly identifies up to 97% of requests on a sample of 775 sites and over 300,000 real-world traffic dumps recorded over a two-month period. It outperforms previously known methods like Jaccard's classifier and Naïve Bayes that neglect packet frequencies altogether or rely on absolute frequency values, respectively. Our method is system-agnostic: it can be used against any PET without alteration. Closed-world results indicate that many popular single-hop and even multi-hop systems like Tor and JonDonym are vulnerable to this general fingerprinting attack. Furthermore, we discuss important real-world issues, namely false alarms and the influence of the browser cache on accuracy.
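To make the core idea of the abstract concrete, the following is a minimal sketch of a multinomial naïve Bayes classifier over normalised packet-size frequencies. It is not the authors' implementation. The names `normalised_histogram` and `PacketSizeNB`, the use of signed sizes to encode packet direction, the Laplace smoothing parameter `alpha`, and the toy traces are all assumptions made for illustration; the abstract does not specify the exact normalisation or smoothing used.

```python
import math
from collections import Counter, defaultdict


def normalised_histogram(packet_sizes):
    """Turn a trace (list of packet sizes) into relative frequencies per size.

    Assumption: sizes are signed, with the sign encoding direction.
    """
    counts = Counter(packet_sizes)
    total = sum(counts.values())
    return {size: n / total for size, n in counts.items()}


class PacketSizeNB:
    """Hypothetical multinomial naive Bayes over packet-size histograms."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (an assumption)

    def fit(self, traces, labels):
        self.vocab = set()                      # all packet sizes seen in training
        self.class_counts = Counter(labels)     # number of traces per site
        self.feature_sums = defaultdict(Counter)  # summed relative frequencies
        for trace, label in zip(traces, labels):
            for size, freq in normalised_histogram(trace).items():
                self.feature_sums[label][size] += freq
                self.vocab.add(size)
        return self

    def predict(self, trace):
        hist = normalised_histogram(trace)
        n_total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            sums = self.feature_sums[label]
            denom = sum(sums.values()) + self.alpha * len(self.vocab)
            # log prior + frequency-weighted log likelihoods
            score = math.log(n / n_total)
            for size, freq in hist.items():
                likelihood = (sums.get(size, 0.0) + self.alpha) / denom
                score += freq * math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (made up): two traces per site.
train_traces = [
    [1500, 1500, 1500, -52, -52, 638],
    [1500, 1500, -52, -52, 638, 638],
    [-52, -52, 310, 310, 310, 1500],
    [-52, 310, 310, -52, 310, 310],
]
train_labels = ["site_a", "site_a", "site_b", "site_b"]

clf = PacketSizeNB().fit(train_traces, train_labels)
print(clf.predict([1500, 1500, -52, 638]))
print(clf.predict([-52, 310, 310, 310]))
```

Because each trace is reduced to relative frequencies before training and prediction, a long and a short download of the same site yield comparable feature vectors, which is the contrast the abstract draws with classifiers that rely on absolute frequency values.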