Novel techniques and models for network traffic profiling: characterizing the unknown

  • Authors:
  • Michalis Faloutsos;Thomas Karagiannis

  • Affiliations:
  • University of California, Riverside;University of California, Riverside

  • Venue:
  • Novel techniques and models for network traffic profiling: characterizing the unknown
  • Year:
  • 2006

Abstract

As the prevalence of novel applications (peer-to-peer, real-time media, VoIP, etc.) and the manifestation of growing malicious activity transform the nature of Internet traffic, the collection and interpretation of empirical Internet data remain a critical yet challenging task. While measurement provides the only accurate source of information regarding the usage of network resources, the community no longer enjoys the fleeting benefit of traditional network traffic, which was relatively easy to profile because it came from a limited number of applications with well-defined structure. This thesis addresses the challenge of profiling the "unknown" by focusing on robust network measurements and characterization of network traffic in the face of a constantly changing traffic mix. This work introduces a novel perspective on the analysis of Internet traffic based on two key elements: (a) shifting the focus from modeling the statistics of individual network flows to studying the behavior of the Internet host, and (b) understanding the intrinsic properties of application connection patterns. We first describe how this concept of traffic analysis can be successfully applied to the identification of peer-to-peer traffic. This approach is a first attempt to characterize P2P traffic using only knowledge of network dynamics rather than any user payload, and we demonstrate its efficacy by presenting extensive evidence that P2P traffic continues to grow unabated, contrary to reports in the popular media. We then generalize our analysis in order to characterize the majority of popular applications in today's Internet, and to classify traffic flows according to the applications that generate them. We introduce the notion of graphlets, which are graphs reflecting the "most common" transport-layer behavior for a particular application, and apply this approach to three real traces, showing that it is able to classify 80%-90% of the traffic with more than 95% accuracy.
Finally, we discuss how this concept of traffic modeling stimulates interesting new directions for future research in various areas and apply its basic principles to end-host profiling with the goal of securing enterprise and campus networks. We present a novel way to profile individual user behavior through a graph-based footprint that generates a compact, robust and intuitive description of user activity. We demonstrate the effectiveness of our host profiling technique by uncovering insights regarding typical user behavior using data from two enterprise networks. We then show the ability of our approach to expose anomalous behavior and non-trivial attacks using real data and controlled experiments.