Pitfalls in HTTP traffic measurements and analysis

  • Authors:
  • Fabian Schneider; Bernhard Ager; Gregor Maier; Anja Feldmann; Steve Uhlig

  • Affiliations:
  • NEC Laboratories Europe, Heidelberg, Germany and Telekom Innovation Laboratories, TU Berlin, Berlin, Germany; Telekom Innovation Laboratories, TU Berlin, Berlin, Germany; Telekom Innovation Laboratories, TU Berlin, Berlin, Germany and International Computer Science Institute, Berkeley, CA; Telekom Innovation Laboratories, TU Berlin, Berlin, Germany; Queen Mary, University of London, London, UK

  • Venue:
  • PAM'12 Proceedings of the 13th international conference on Passive and Active Measurement
  • Year:
  • 2012

Abstract

HTTP is responsible for more than half of the total traffic volume in the Internet, which makes it a popular subject for traffic analysis. From our experience with HTTP traffic analysis, we identified a number of pitfalls that can render even a carefully executed study flawed. Often these pitfalls can be avoided easily. Based on passive traffic measurements of 20,000 European residential broadband customers, we quantify the potential error of three issues: non-consideration of persistent or pipelined HTTP requests, mismatches between the Content-Type header field and the actual content, and mismatches between the Content-Length header and the actually transmitted volume. We find that 60% of all HTTP requests and 30% of all HTTP bytes are persistent (i.e., not the first in a TCP connection) and that 4% are pipelined. Moreover, we observe a Content-Type mismatch for 35% of the total HTTP volume. In terms of Content-Length accuracy, our data shows a factor of at least 3.2 more bytes reported in the HTTP header than actually transferred.
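
As an illustration of how the three quantities named in the abstract could be computed from per-transaction records, the following is a minimal Python sketch and not the authors' tooling. The `Transaction` record, all of its field names, and the `summarize` function are hypothetical. The sketch also assumes that a request counts as pipelined when it is sent before the previous response on the same TCP connection has finished, and that the actual content type has already been inferred from the payload by some external means.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Transaction:
    """One HTTP request/response pair (hypothetical record layout)."""
    conn_id: int                   # identifier of the TCP connection
    req_time: float                # time the request was observed
    resp_end_time: float           # time the last response byte was observed
    header_type: str               # value of the Content-Type header
    detected_type: str             # type inferred from the payload itself
    content_length: Optional[int]  # Content-Length header, if present
    body_bytes: int                # body bytes actually observed on the wire


def _media_type(value: str) -> str:
    # Drop parameters such as "; charset=utf-8" and normalize case.
    return value.split(";")[0].strip().lower()


def _ratio(num: float, den: float) -> float:
    return num / den if den else 0.0


def summarize(transactions: List[Transaction]) -> Dict[str, float]:
    # Group transactions per TCP connection, ordered by request time.
    by_conn: Dict[int, List[Transaction]] = {}
    for t in sorted(transactions, key=lambda t: (t.conn_id, t.req_time)):
        by_conn.setdefault(t.conn_id, []).append(t)

    total = persistent = pipelined = 0
    total_bytes = persistent_bytes = mismatch_bytes = 0
    header_bytes = actual_bytes = 0

    for txs in by_conn.values():
        for i, t in enumerate(txs):
            total += 1
            total_bytes += t.body_bytes
            if i > 0:
                # Persistent: not the first request in its TCP connection.
                persistent += 1
                persistent_bytes += t.body_bytes
                # Assumed definition of pipelining: the request was sent
                # before the previous response had completed.
                if t.req_time < txs[i - 1].resp_end_time:
                    pipelined += 1
            if _media_type(t.header_type) != _media_type(t.detected_type):
                mismatch_bytes += t.body_bytes
            if t.content_length is not None:
                header_bytes += t.content_length
                actual_bytes += t.body_bytes

    return {
        "persistent_request_share": _ratio(persistent, total),
        "persistent_byte_share": _ratio(persistent_bytes, total_bytes),
        "pipelined_request_share": _ratio(pipelined, total),
        "type_mismatch_byte_share": _ratio(mismatch_bytes, total_bytes),
        # Bytes announced in Content-Length divided by bytes actually seen,
        # restricted to transactions that carry the header.
        "length_overreport_factor": _ratio(header_bytes, actual_bytes),
    }
```

An analysis that inspects only the first request of each TCP connection would never reach the `i > 0` branch, so it would silently drop every persistent and pipelined request from its totals.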