Accurate, scalable in-network identification of p2p traffic using application signatures
Proceedings of the 13th international conference on World Wide Web
On the lack of typical behavior in the global Web traffic network
WWW '05 Proceedings of the 14th international conference on World Wide Web
BLINC: multilevel traffic classification in the dark
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
ACAS: automated construction of application signatures
Proceedings of the 2005 ACM SIGCOMM workshop on Mining network data
Unexpected means of protocol inference
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Role classification of hosts within enterprise networks based on connection patterns
ATEC '03 Proceedings of the annual conference on USENIX Annual Technical Conference
Classification in Networked Data: A Toolkit and a Univariate Case Study
The Journal of Machine Learning Research
Network monitoring using traffic dispersion graphs (tdgs)
Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
Unconstrained endpoint profiling (googling the internet)
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
Internet traffic classification demystified: myths, caveats, and the best practices
CoNEXT '08 Proceedings of the 2008 ACM CoNEXT Conference
A survey of techniques for internet traffic classification using machine learning
IEEE Communications Surveys & Tutorials
Profiling-By-Association: a resilient traffic profiling solution for the internet backbone
Proceedings of the 6th International COnference
Improving matching performance of DPI traffic classifier
Proceedings of the 2011 ACM Symposium on Applied Computing
Discriminating graphs through spectral projections
Computer Networks: The International Journal of Computer and Telecommunications Networking
Detecting malware with graph-based methods: traffic classification, botnets, and facebook scams
Proceedings of the 22nd international conference on World Wide Web companion
Hi-index | 0.00 |
We address the following questions. Is there link homophily in the application layer traffic? If so, can it be used to accurately classify traffic in network trace data without relying on payloads or properties at the flow level? Our research shows that the answers to both of these questions are affirmative in real network trace data. Specifically, we define link homophily to be the tendency for flows with common IP hosts to have the same application (P2P, Web, etc.) compared to randomly selected flows. The presence of link homophily in trace data provides us with statistical dependencies between flows that share common IP hosts. We utilize these dependencies to classify application layer traffic without relying on payloads or properties at the flow level. In particular, we introduce a new statistical relational learning algorithm, called Neighboring Link Classifier with Relaxation Labeling (NLC+RL). Our algorithm has no training phase and does not require features to be constructed. All that it needs to start the classification process is traffic information on a small portion of the initial flows, which we refer to as seeds. In all our traces, NLC+RL achieves above 90% accuracy with less than 5% seed size; it is robust to errors in the seeds and various seed-selection biases; and it is able to accurately classify challenging traffic such as P2P with over 90% Precision and Recall.