MINETRAC: mining flows for unsupervised analysis & semi-supervised classification

Authors:
Pedro Casas; Johan Mazel; Philippe Owezarski
Affiliations:
CNRS/ LAAS/ Toulouse Cedex, France, and Universite de Toulouse/ UPS, INSA, INP, ISAE/ UT, UTM, LAAS/ Toulouse Cedex, France (all three authors)
Venue:
Proceedings of the 23rd International Teletraffic Congress
Year:
2011

Abstract

Driven by the well-known limitations of port-based and payload-based analysis techniques, the use of Machine Learning for Internet traffic analysis and classification has become a fertile research area during the past half-decade. In this paper we introduce MINETRAC, a combination of unsupervised and semi-supervised machine learning techniques capable of identifying and classifying different classes of IP flows that share similar characteristics. The unsupervised analysis relies on robust clustering techniques, using Sub-Space Clustering, Evidence Accumulation, and Hierarchical Clustering algorithms to explore inter-flow structure. MINETRAC identifies natural groupings of traffic flows by combining the evidence of data structure provided by different partitions of the same set of traffic flows. Automatic classification is performed by means of semi-supervised learning, using only a small fraction of ground-truth flows to map each identified cluster to its most probable originating network service or application. We evaluate MINETRAC on real traffic traces and compare its performance against previously proposed clustering-based flow analysis methods and supervised/semi-supervised classification approaches.
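
The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it makes several assumptions: flows are tuples of already normalized features, each sub-space is a pair of features, and a simple distance-threshold linking stands in for the sub-space clustering algorithm. The partitions are combined through co-association counts, and flows that co-cluster in enough partitions are merged by taking connected components, which is equivalent to cutting a single-linkage dendrogram at that agreement level. All function names, the `eps` and `min_agreement` parameters, the sample flows, and the "web"/"p2p" labels are hypothetical.

```python
from collections import Counter
from itertools import combinations


def components(n, linked):
    """Group indices 0..n-1 into connected components of the `linked` relation."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if linked(i, j):
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def subspace_partitions(flows, eps):
    """Yield one partition per 2-feature sub-space (assumed stand-in clustering)."""
    n, dims = len(flows), len(flows[0])
    for a, b in combinations(range(dims), 2):
        def close(i, j, a=a, b=b):
            da = flows[i][a] - flows[j][a]
            db = flows[i][b] - flows[j][b]
            return (da * da + db * db) ** 0.5 < eps
        yield components(n, close)


def evidence_accumulation(flows, eps=0.3, min_agreement=0.5):
    """Merge flows that share a cluster in at least min_agreement of the partitions."""
    n = len(flows)
    partitions = list(subspace_partitions(flows, eps))
    votes = Counter()  # co-association counts for pairs (i, j) with i < j
    for part in partitions:
        for i, j in combinations(range(n), 2):
            if part[i] == part[j]:
                votes[i, j] += 1
    needed = min_agreement * len(partitions)
    return components(n, lambda i, j: votes[i, j] >= needed)


def label_clusters(clusters, known_labels):
    """Give each cluster the majority label of its few ground-truth flows."""
    per_cluster = {}
    for idx, label in known_labels.items():
        per_cluster.setdefault(clusters[idx], Counter())[label] += 1
    majority = {c: cnt.most_common(1)[0][0] for c, cnt in per_cluster.items()}
    return [majority.get(c, "unknown") for c in clusters]


# Hypothetical normalized flow features (three per flow).
flows = [
    (0.10, 0.10, 0.10),
    (0.15, 0.12, 0.08),
    (0.12, 0.09, 0.95),  # close to the first two in only one sub-space
    (0.90, 0.85, 0.90),
    (0.88, 0.90, 0.92),
    (0.92, 0.88, 0.87),
]
clusters = evidence_accumulation(flows)
# Only two flows carry ground-truth labels.
labels = label_clusters(clusters, {0: "web", 4: "p2p"})
print(labels)
```

In this toy input the third flow resembles the first two in a single sub-space only, so the accumulated evidence keeps it apart and it stays "unknown" because no labelled flow falls in its cluster. The agreement threshold controls how much cross-partition support is needed before two flows are grouped together.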