Feature selection for optimizing traffic classification

  • Authors:
  • Hongli Zhang;Gang Lu;Mahmoud T. Qassrawi;Yu Zhang;Xiangzhan Yu

  • Affiliations:
  • School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, PR China (all authors)

  • Venue:
  • Computer Communications
  • Year:
  • 2012

Quantified Score

Hi-index 0.24

Abstract

Machine learning (ML) algorithms have been widely applied to traffic classification in recent years. However, because traffic flows are unevenly distributed across classes, ML-based classifiers are prone to misclassifying flows as the traffic type that accounts for the majority of flows on the Internet. To address this problem, we propose a novel feature selection metric named Weighted Symmetrical Uncertainty (WSU). We design a hybrid feature selection algorithm named WSU_AUC, which prefilters most of the features with the WSU metric and then uses a wrapper method, guided by the Area Under the ROC Curve (AUC) metric, to select features for a specific classifier. Additionally, to overcome the impact of dynamic traffic flows on feature selection, we propose an algorithm named SRSF that Selects the Robust and Stable Features from the results achieved by WSU_AUC. We evaluate our approaches using three classifiers on traces captured from entirely different networks. Experimental results obtained by our algorithms are promising in terms of true positive rate (TPR) and false positive rate (FPR). Moreover, our algorithms achieve 94% flow accuracy and 80% byte accuracy on average.
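
The abstract does not define the class weighting used by WSU, so the following is only a minimal sketch of the standard, unweighted symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), on which a filter-style prefiltering step could be based. The function names (`entropy`, `symmetrical_uncertainty`, `prefilter`), the threshold parameter, and the assumption that features are already discretized are all illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def symmetrical_uncertainty(feature, labels):
    """Unweighted SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1].

    This is NOT the paper's WSU: the weighting is not given in the abstract.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to measure.
        return 0.0
    # Joint entropy H(X, Y) computed over (feature value, label) pairs.
    h_xy = entropy(list(zip(feature, labels)))
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


def prefilter(features, labels, threshold):
    """Return names of features whose SU with the label is >= threshold,
    ordered from highest to lowest score (hypothetical filter step)."""
    scores = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in features.items()
    }
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [name for name, score in ranked if score >= threshold]
```

A wrapper stage such as the one the abstract describes would then evaluate the surviving features with a specific classifier and an AUC score; that stage is not sketched here because the abstract gives no details of its search procedure.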