P2P traffic classification using ensemble learning

  • Authors:
  • Jagan Mohan Reddy;Chittaranjan Hota

  • Affiliations:
  • Birla Institute of Technology and Science-Pilani, A.P., India;Birla Institute of Technology and Science-Pilani, A.P., India

  • Venue:
  • Proceedings of the 5th IBM Collaborative Academia Research Exchange Workshop
  • Year:
  • 2013

Abstract

Early Peer-to-Peer (P2P) overlay network traffic classification schemes relied on port-based and payload-based inspection. In recent years, researchers have turned to alternative machine learning approaches. This paper presents ensemble learning, which combines multiple models to improve prediction accuracy over a single classifier or semi-supervised learning techniques. We first extract statistical characteristics of TCP and UDP flows from network traces to construct a feature set. We then apply feature selection techniques to reduce the number of features required to train the model, thereby reducing the build time. We use the Stacking and Voting ensemble learning techniques to improve prediction accuracy, with base classifiers modelled using the following Machine Learning (ML) algorithms: Naïve Bayes, Bayesian Network, and Decision Trees. We use meta classifiers to further improve classification accuracy to 99.9%. Our experimental results show that Stacking performs better than Voting in identifying P2P traffic.
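
The difference between Voting and Stacking can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the feature names (mean_pkt_size, duration, peers), the thresholds, and the toy flow records are invented, and the three fixed threshold rules merely stand in for the Naïve Bayes, Bayesian Network, and Decision Tree base classifiers. The meta classifier here is a simple lookup table over base-classifier outputs, chosen only to keep the example short; it is not the meta classifier used by the authors.

```python
from collections import Counter, defaultdict

# Hypothetical base classifiers: fixed threshold rules on per-flow statistics.
# They stand in for trained models and are NOT the paper's classifiers.
def by_size(flow):
    return "p2p" if flow["mean_pkt_size"] > 600 else "other"

def by_duration(flow):
    return "p2p" if flow["duration"] > 10 else "other"

def by_peers(flow):
    return "p2p" if flow["peers"] > 5 else "other"

BASE = [by_size, by_duration, by_peers]

def base_outputs(flow):
    """Collect the label predicted by each base classifier."""
    return tuple(clf(flow) for clf in BASE)

def vote(flow):
    """Voting: return the label chosen by most base classifiers."""
    return Counter(base_outputs(flow)).most_common(1)[0][0]

def fit_stacker(flows, labels):
    """Stacking: learn a meta classifier that maps each combination of
    base outputs to the true label seen most often for that combination."""
    counts = defaultdict(Counter)
    for flow, label in zip(flows, labels):
        counts[base_outputs(flow)][label] += 1
    return {combo: c.most_common(1)[0][0] for combo, c in counts.items()}

def stack_predict(table, flow):
    """Use the learned table; fall back to voting for unseen combinations."""
    return table.get(base_outputs(flow), vote(flow))

# Invented toy training flows (not real trace data).
train_flows = [
    {"mean_pkt_size": 1200, "duration": 30, "peers": 12},
    {"mean_pkt_size": 200, "duration": 1, "peers": 1},
    {"mean_pkt_size": 150, "duration": 2, "peers": 9},
    {"mean_pkt_size": 180, "duration": 3, "peers": 8},
    {"mean_pkt_size": 1400, "duration": 60, "peers": 2},
    {"mean_pkt_size": 1300, "duration": 5, "peers": 1},
]
train_labels = ["p2p", "other", "p2p", "p2p", "other", "other"]

table = fit_stacker(train_flows, train_labels)

# A new flow with small packets and a short duration but many peers.
new_flow = {"mean_pkt_size": 170, "duration": 1.5, "peers": 10}
print("voting  :", vote(new_flow))
print("stacking:", stack_predict(table, new_flow))
```

In this toy data the peer-count rule is the most reliable signal, so the meta classifier learns to trust it even when the other two rules disagree, whereas Voting weights all three rules equally. Because the base rules here are fixed rather than trained, the meta classifier can be fitted on the same flows; with trained base classifiers, the meta classifier would normally be fitted on held-out or cross-validated predictions.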