Slimming down Deep packet inspection systems

  • Authors:
  • Stênio Fernandes;Rafael Antonello;Thiago Lacerda;Alysson Santos;Djamel Sadok;Tord Westholm

  • Affiliations:
  • SITE - University of Ottawa, Ottawa, Canada and CEFET-AL, Maceio, Brazil;Federal University of Pernambuco, Recife, Brazil;Federal University of Pernambuco, Recife, Brazil;Federal University of Pernambuco, Recife, Brazil;Federal University of Pernambuco, Recife, Brazil;Ericsson Research, Stockholm, Sweden

  • Venue:
  • INFOCOM'09 Proceedings of the 28th IEEE international conference on Computer Communications Workshops
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Internet Service Providers (ISP) have been recently relying on Deep Packet Inspection (DPI) systems, which are the most accurate techniques for traffic identification and classification. However, building high performance DPI systems requires an in-depth and careful computing system design due to the memory and processing power demands. DPI's accuracy mostly depends on string matching process and regular expression heuristics that go deep down on the packet payloads in a search for networked application signatures. As ISPs backbone links' speed and data volume soar, commodity hardware-based DPI systems start to face performance bottlenecks (e.g., packet losses), which interferes on traffic classification accuracy dramatically. In this paper we propose a lightweight DPI (LW-DPI) system that overcomes performance bottlenecks of traditional DPI systems, without a significant decrease on accuracy. We evaluate LW-DPI's accuracy by inspecting two factors: a limited number of full-payload packets in a given flow or a fraction of the packet payload. Our experiments were performed using more than 6TB of packet-level data from a large ISP and show that there is some interesting trade-offs between such factors and accuracy. Most flows can be classified with only their first 7 packets or a fraction of their payload. We also show that the impact on DPI's processing time may decrease around 75% as compared to analyzing all full-payload packets in a flow.