Synoptic graphlet: bridging the gap between supervised and unsupervised profiling of host-level network traffic

  • Authors:
  • Yosuke Himura;Kensuke Fukuda;Kenjiro Cho;Pierre Borgnat;Patrice Abry;Hiroshi Esaki

  • Affiliations:
  • University of Tokyo, Tokyo, Japan;National Institute of Informatics and PRESTO, JST, Tokyo, Japan;Internet Initiative Japan, Tokyo, Japan and Keio University, Tokyo, Japan;CNRS and École Normale Supérieure de Lyon, Laboratoire de Physique, Lyon, France;CNRS and École Normale Supérieure de Lyon, Laboratoire de Physique, Lyon, France;University of Tokyo, Tokyo, Japan

  • Venue:
  • IEEE/ACM Transactions on Networking (TON)
  • Year:
  • 2013

Abstract

End-host profiling through network traffic analysis has become a major concern in traffic engineering. Graphlets, which are essentially visual representations of host behavior as graphs, constitute an efficient and common framework for interpreting host behaviors. However, graphlet analyses face the issue of choosing between supervised and unsupervised approaches. The former can analyze behaviors defined a priori but is blind to undefined classes, while the latter can discover new behaviors at the cost of difficult a posteriori interpretation. This paper aims to bridge the gap between the two. First, to handle unknown classes, unsupervised clustering is revisited in an original way by extracting a set of graphlet-inspired attributes for each host. Second, to recover interpretability for each resulting cluster, a synoptic graphlet, defined as a visual graphlet obtained by mapping from a cluster, is newly developed. Comparisons against supervised graphlet-based, port-based, and payload-based classifiers on two datasets demonstrate the effectiveness of the unsupervised clustering of graphlets and the relevance of the a posteriori interpretation through synoptic graphlets. This development is further complemented by a study of the evolutionary tree of synoptic graphlets, which quantifies the growth of graphlets as the number of inspected packets per host increases.
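As a rough illustration of the first step (per-host graphlet-inspired attributes followed by unsupervised grouping), the sketch below counts distinct values in each graphlet column for every source host and groups hosts whose counts coincide. Everything here is an assumption made for illustration: the five-column flow layout, the sample data, the function names `host_attributes` and `cluster_hosts`, and the exact-match grouping, which stands in for a real clustering algorithm. The abstract does not specify the paper's attribute set or clustering method, so this should not be read as the authors' implementation.

```python
from collections import defaultdict

# Hypothetical flow records: (src_ip, protocol, src_port, dst_port, dst_ip).
# The column layout is an assumption, not taken from the abstract.
FLOWS = [
    ("10.0.0.1", "tcp", 40001, 80, "192.0.2.1"),
    ("10.0.0.1", "tcp", 40002, 80, "192.0.2.2"),
    ("10.0.0.1", "tcp", 40003, 80, "192.0.2.3"),
    ("10.0.0.2", "udp", 53, 53001, "192.0.2.9"),
    ("10.0.0.2", "udp", 53, 53002, "192.0.2.9"),
    ("10.0.0.3", "tcp", 41001, 443, "192.0.2.4"),
    ("10.0.0.3", "tcp", 41002, 443, "192.0.2.5"),
    ("10.0.0.3", "tcp", 41003, 443, "192.0.2.6"),
]


def host_attributes(flows):
    """Return {src_ip: (n_protocols, n_src_ports, n_dst_ports, n_dst_ips)}.

    Each count is the number of distinct nodes the host would have in the
    corresponding column of its graphlet.
    """
    columns = defaultdict(lambda: [set(), set(), set(), set()])
    for src_ip, proto, sport, dport, dst_ip in flows:
        for col, value in zip(columns[src_ip], (proto, sport, dport, dst_ip)):
            col.add(value)
    return {host: tuple(len(c) for c in cols) for host, cols in columns.items()}


def cluster_hosts(attributes):
    """Group hosts with identical attribute vectors.

    This exact-match grouping is only a placeholder for a real
    unsupervised clustering algorithm.
    """
    clusters = defaultdict(list)
    for host, vector in sorted(attributes.items()):
        clusters[vector].append(host)
    return dict(clusters)


if __name__ == "__main__":
    attrs = host_attributes(FLOWS)
    for vector, hosts in cluster_hosts(attrs).items():
        print(vector, hosts)
```

In this toy data, the two hosts that contact several destinations on a single destination port fall into one group, while the host that uses a single source port toward one destination forms its own. A cluster's shared attribute vector is the kind of summary that a representative graphlet could be drawn from, which is the role the abstract assigns to the synoptic graphlet.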