High throughput and programmable online trafficclassifier on FPGA

  • Authors:
  • Da Tong;Lu Sun;Kiran Matam;Viktor Prasanna

  • Affiliations:
  • University of Southern California, Los Angeles, CA, USA;University of Southern California, Los Angeles, CA, USA;University of Southern California, Los Angeles, CA, USA;University of Southern California, Los Angeles, CA, USA

  • Venue:
  • Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Machine learning (ML) algorithms have been shown to be effective in classifying the dynamic internet traffic today. Using additional features and sophisticated ML techniques can improve accuracy and can classify a broad range of application classes. Realizing such classifiers to meet high data rates is challenging. In this paper, we propose two architectures to realize complete online traffic classifier using flow-level features. First, we develop a traffic classifier based on C4.5 decision tree algorithm and Entropy-MDL discretization algorithm. It achieves an accuracy of 97.92% when classifying a traffic trace consisting of eight application classes. Next, we accelerate our classifier using two architectures on FPGA. One architecture stores the classifier in on-chip distributed RAM. It is designed to sustain a high throughput. The other architecture stores the classifier in block RAM. It is designed to operate with small hardware footprint and thus built at low hardware cost. Experimental results show that our high throughput architecture can sustain a throughput of $550$ Gbps assuming 40 Byte packet size. Our low cost architecture demonstrates a 22% better resource efficiency than the high throughput design. It can be easily replicated to achieve $449$ Gbps while supporting 160 input traffic streams concurrently. Both architectures are parameterizable and programmable to support any binary-tree-based traffic classifier. We develop a tool which allows users to easily map a binary-tree-based classifier to hardware. The tool takes a classifier as input and automatically generates the Verilog code for the corresponding hardware architecture.