A high-level programming environment for packet trace anonymization and transformation

  • Authors:
  • Ruoming Pang; Vern Paxson

  • Affiliations:
  • Princeton University; International Computer Science Institute

  • Venue:
  • Proceedings of the 2003 conference on Applications, technologies, architectures, and protocols for computer communications
  • Year:
  • 2003

Abstract

Packet traces of operational Internet traffic are invaluable to network research, but public sharing of such traces is severely limited by the need to first remove all sensitive information. Current trace anonymization technology leaves only the packet headers intact, completely stripping the contents; to our knowledge, there are no publicly available traces of any significant size that contain packet payloads. We describe a new approach to transform and anonymize packet traces. Our tool provides high-level language support for packet transformation, allowing the user to write short policy scripts to express sophisticated trace transformations. The resulting scripts can anonymize both packet headers and payloads, and can perform application-level transformations such as editing HTTP or SMTP headers, replacing the content of Web items with MD5 hashes, or altering filenames or reply codes that match given patterns. We discuss the critical issue of verifying that anonymizations are both correctly applied and correctly specified, and experiences with anonymizing FTP traces from the Lawrence Berkeley National Laboratory for public release.
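
As a rough illustration only, the kinds of application-level transformations named in the abstract (replacing Web content with MD5 hashes, rewriting filenames that match given patterns, editing HTTP headers) might be sketched in Python as follows. This is not the authors' tool or its policy-script language; the function names, the filename pattern, the key, and the list of dropped headers are all hypothetical choices made for the example.

```python
import hashlib
import re

# Hypothetical pattern for filenames treated as sensitive (not from the paper).
SENSITIVE_NAME = re.compile(r"(passwd|secret|\.ssh)", re.IGNORECASE)


def hash_body(body: bytes) -> bytes:
    """Replace a Web item's content with the hex MD5 digest of that content."""
    return hashlib.md5(body).hexdigest().encode("ascii")


def anonymize_filename(name: str, key: bytes = b"example-key") -> str:
    """Map a filename matching the sensitive pattern to a keyed hash token.

    The key is an assumed per-trace secret; other names pass through unchanged.
    """
    if SENSITIVE_NAME.search(name):
        digest = hashlib.md5(key + name.encode("utf-8")).hexdigest()
        return "anon-" + digest[:8]
    return name


def transform_http_response(headers: dict, body: bytes,
                            drop=("set-cookie", "server")):
    """Drop the listed headers, hash the body, and fix up Content-Length."""
    new_body = hash_body(body)
    new_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in drop
    }
    new_headers["Content-Length"] = str(len(new_body))
    return new_headers, new_body


if __name__ == "__main__":
    hdrs = {"Server": "Apache", "Content-Type": "text/html", "Content-Length": "5"}
    print(transform_http_response(hdrs, b"hello"))
    print(anonymize_filename("/etc/passwd"), anonymize_filename("readme.txt"))
```

Because the same input always maps to the same token, a transformed trace keeps repeated references to one file or one Web item recognizable as such without exposing the original name or content. Whether that property is acceptable depends on the anonymization policy in use.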