Early application identification

  • Authors: Laurent Bernaille, Renata Teixeira, Kave Salamatian

  • Affiliations: Université Pierre et Marie Curie, Paris, France (all three authors)

  • Venue: CoNEXT '06: Proceedings of the 2006 ACM CoNEXT conference
  • Year: 2006

Abstract

The automatic detection of applications associated with network traffic is an essential step for network security and traffic engineering. Unfortunately, simple port-based classification methods are not always efficient, and systematic analysis of packet payloads is too slow. Most recent research proposals use flow statistics to classify traffic flows once they are finished, which limits their applicability for online classification. In this paper, we evaluate the feasibility of application identification at the beginning of a TCP connection. Based on an analysis of packet traces collected on eight different networks, we find that it is possible to distinguish the behavior of an application by observing the size and direction of the first few packets of the TCP connection. We apply three techniques to cluster TCP connections: K-Means, Gaussian Mixture Model, and spectral clustering. The resulting clusters are used together with assignment and labeling heuristics to design classifiers. We evaluate these classifiers on different packet traces. Our results show that the first four packets of a TCP connection are sufficient to classify known applications with an accuracy over 90% and to identify new applications as unknown with a probability of 60%.
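
The pipeline described in the abstract (cluster connections on the size and direction of their first few packets, label the clusters, then assign new connections) can be illustrated with a small sketch. This is not the authors' implementation: the toy flows, the signed-size encoding (positive for client to server, negative for server to client), the majority-vote labeling, the nearest-centroid assignment, and the `max_distance` cutoff for reporting "unknown" are all assumptions made for illustration, and only K-Means is shown.

```python
import math
from collections import Counter

# Hypothetical training flows: signed sizes of the first four packets.
# Positive = client to server, negative = server to client (assumed encoding).
TRAINING = [
    ((120, -1460, -1460, -1460), "http"),
    ((140, -1400, -1460, -1380), "http"),
    ((110, -1460, -1420, -1460), "http"),
    ((-90, 30, -40, 25), "smtp"),
    ((-100, 25, -45, 30), "smtp"),
    ((-85, 28, -50, 22), "smtp"),
]


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Plain K-Means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(distance(p, c) for c in centroids))
        )
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # keep the old centroid if a cluster ended up empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def label_clusters(training, centroids):
    """Label each cluster with the most common application among its flows."""
    votes = [Counter() for _ in centroids]
    for features, app in training:
        votes[nearest(features, centroids)][app] += 1
    return [v.most_common(1)[0][0] if v else "unknown" for v in votes]


def classify(flow, centroids, labels, max_distance=300.0):
    """Assign a flow to its nearest cluster, or 'unknown' if it is too far."""
    idx = nearest(flow, centroids)
    if distance(flow, centroids[idx]) > max_distance:
        return "unknown"
    return labels[idx]


centroids = kmeans([features for features, _ in TRAINING], k=2)
labels = label_clusters(TRAINING, centroids)

print(classify((130, -1450, -1460, -1440), centroids, labels))
print(classify((-95, 27, -42, 26), centroids, labels))
print(classify((700, 700, 700, 700), centroids, labels))
```

Because classification needs only the first four packet sizes and directions, a decision of this kind can be made while the connection is still in progress, which is the property the abstract contrasts with methods that wait for a flow to finish. The distance cutoff is one simple way to flag connections that resemble no known cluster; the value 300 is arbitrary and would need tuning on real traces.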