Session-based classification of internet applications in 3G wireless networks

  • Authors:
  • Seongjin Lee;Jongwoo Song;Soohan Ahn;Youjip Won

  • Affiliations:
  • Dept. of Electronics and Computer Engineering, Hanyang University, Republic of Korea;Dept. of Statistics, Ewha Womans University, Republic of Korea;Dept. of Statistics, University of Seoul, Republic of Korea;Dept. of Electronics and Computer Engineering, Hanyang University, Republic of Korea

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking
  • Year:
  • 2011

Abstract

Accurately classifying and identifying wireless network traffic associated with various applications, such as Web, VoIP, and VoD, is a challenge for both service providers and network operators. Traditional classification schemes that rely on port or payload analysis are becoming ineffective in actual networks as many new applications emerge. This paper presents the classification of HSDPA network traffic applications using Classification and Regression Tree (CART) and Support Vector Machine (SVM), with session information as the basic measure. A session is a bidirectional traffic stream between two hosts, and it serves as the unit of information. We acquired and processed HSDPA traffic from a real 3G network without sanitizing the data. CART and SVM are used to classify six application groups (download, game, upload, VoD, VoIP, and web) with a set of twelve easily retrievable features. These features are simple statistics, such as the standard deviation of the packet sizes, the number of packets, and the duration of a session. Compared to the results of flow-based application classification, session-based classification increases the true positive rate by 11.07% (CART) and 21.99% (SVM). The feature set is further reduced to two principal components using Principal Component Regression. This paper also compares CART to K-Means, a clustering scheme used for wired network traffic, and shows that CART is more accurate for classification than K-Means.
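
To make the session idea concrete, the following is a minimal sketch of how packets might be grouped into bidirectional sessions and summarized with the three features the abstract names (number of packets, duration, and standard deviation of packet sizes). It is an illustration under stated assumptions, not the authors' implementation: the packet tuple layout, the host-pair session key, and the names `session_key` and `session_features` are hypothetical, and the remaining nine features are not listed in the abstract and so are not shown.

```python
from collections import defaultdict
from statistics import pstdev


def session_key(src, dst):
    """Direction-independent key, so A->B and B->A fall into one session.

    Assumption: a session is identified by the host pair alone.
    """
    return tuple(sorted((src, dst)))


def session_features(packets):
    """Summarize each session with three simple statistics.

    packets: iterable of (timestamp_seconds, src_host, dst_host, size_bytes).
    This tuple layout is hypothetical, chosen for illustration.
    """
    sessions = defaultdict(list)
    for ts, src, dst, size in packets:
        sessions[session_key(src, dst)].append((ts, size))

    features = {}
    for key, pkts in sessions.items():
        times = [t for t, _ in pkts]
        sizes = [s for _, s in pkts]
        features[key] = {
            "num_packets": len(pkts),
            "duration": max(times) - min(times),
            # Population standard deviation; 0.0 for a single packet.
            "size_std": pstdev(sizes),
        }
    return features


# Made-up packets: two hosts exchanging traffic, plus a one-packet session.
packets = [
    (0.0, "10.0.0.1", "10.0.0.2", 100),
    (0.5, "10.0.0.2", "10.0.0.1", 300),
    (2.0, "10.0.0.1", "10.0.0.2", 100),
    (1.0, "10.0.0.1", "10.0.0.3", 500),
]
features = session_features(packets)
```

A flow-based scheme would treat the two directions as separate units; keying on the sorted host pair is what merges them into one session here. The resulting feature rows could then be passed to a decision-tree or SVM classifier.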