Identify P2P traffic by inspecting data transfer behavior

  • Authors:
  • Ke Xu; Ming Zhang; Mingjiang Ye; Dah Ming Chiu; Jianping Wu

  • Affiliations:
  • Ke Xu, Ming Zhang, Mingjiang Ye, Jianping Wu: Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science, Tsinghua University, Beijing 100084, PR China
  • Dah Ming Chiu: Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, PR China

  • Venue:
  • Computer Communications
  • Year:
  • 2010

Quantified Score

Hi-index 0.24

Abstract

Classifying network traffic by application is important across a broad range of networking areas. Because new applications, especially P2P applications, no longer use well-known fixed port numbers, the naive port-based traffic classification technique has become much less effective. In this paper, we propose a novel approach that identifies P2P traffic by leveraging the data transfer behavior of P2P applications. The behavior investigated is that data downloaded by a P2P host is later uploaded to other hosts. To find the data shared between downloading flows and uploading flows online, a content-based partitioning scheme is proposed that partitions the flows into data blocks. Flows sharing the same data blocks are identified as P2P flows. Theoretical analysis proves that the content-based partitioning scheme is stable and effective. Experiments on various P2P applications demonstrate that the method is generic and can be applied to most P2P applications. Experimental results show that the algorithm identifies P2P applications accurately while keeping only a small set of data blocks.
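
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the partitioning rule, so the sketch assumes a common form of content-based partitioning in which a block boundary is declared wherever a hash of the last few bytes hits a fixed residue. Because boundaries depend only on content, the same data yields the same blocks even when it appears at different offsets in two flows. All names and constants here (WINDOW, DIVISOR, MIN_BLOCK, SharedBlockDetector) are hypothetical, and the sketch stores every block fingerprint, whereas the paper reports keeping only a small set.

```python
import hashlib
import random
import zlib

WINDOW = 16      # bytes hashed to decide on a boundary (assumed value)
DIVISOR = 64     # blocks average roughly DIVISOR bytes (assumed value)
MIN_BLOCK = 16   # drop tiny blocks that could match by chance (assumed value)


def partition(payload):
    """Cut payload wherever the hash of the last WINDOW bytes is 0 mod DIVISOR."""
    blocks, start = [], 0
    for end in range(WINDOW, len(payload) + 1):
        if zlib.crc32(payload[end - WINDOW:end]) % DIVISOR == 0:
            if end - start >= MIN_BLOCK:
                blocks.append(payload[start:end])
            start = end
    return blocks  # the trailing partial block is intentionally dropped


def fingerprints(payload):
    """Return the set of SHA-1 digests of the content-defined blocks."""
    return {hashlib.sha1(block).digest() for block in partition(payload)}


class SharedBlockDetector:
    """Flags an upload as P2P if it shares a block with the host's downloads."""

    def __init__(self):
        self.downloaded = {}  # host -> set of block digests seen in downloads

    def observe_download(self, host, payload):
        self.downloaded.setdefault(host, set()).update(fingerprints(payload))

    def upload_is_p2p(self, host, payload):
        return bool(self.downloaded.get(host, set()) & fingerprints(payload))


def random_bytes(seed, n):
    """Deterministic pseudo-random bytes, used only to build demo traffic."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


if __name__ == "__main__":
    shared = random_bytes(0, 4096)            # data a peer downloads, then re-uploads
    detector = SharedBlockDetector()
    detector.observe_download("10.0.0.5", random_bytes(1, 100) + shared)
    print(detector.upload_is_p2p("10.0.0.5", random_bytes(2, 37) + shared))
    print(detector.upload_is_p2p("10.0.0.5", random_bytes(3, 4096)))
```

In the demo, the shared data is preceded by headers of different lengths in the two flows, which is the situation where fixed-size blocks would fail to line up and content-defined boundaries still do. Recomputing the CRC for every window keeps the sketch short; a real monitor would more likely use a rolling hash and bound the number of stored fingerprints.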