Driven by the well-known limitations of port-based and payload-based analysis techniques, the use of machine learning for Internet traffic analysis and classification has become a fertile research area during the past half-decade. In this paper we introduce MINETRAC, a combination of unsupervised and semi-supervised machine learning techniques that identifies and classifies classes of IP flows sharing similar characteristics. The unsupervised analysis relies on robust clustering techniques, using Sub-Space Clustering, Evidence Accumulation, and Hierarchical Clustering algorithms to explore inter-flow structure. MINETRAC identifies natural groupings of traffic flows by combining the evidence of data structure provided by different partitions of the same set of traffic flows. Automatic classification is performed by semi-supervised learning, using only a small fraction of ground-truth flows to map each identified cluster to its most probable originating network service or application. We evaluate MINETRAC on real traffic traces and compare its performance against previously proposed clustering-based flow analysis methods and supervised/semi-supervised classification approaches.
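The two ideas named in the abstract, combining evidence from several partitions and mapping clusters to applications from a few labeled flows, can be illustrated with a minimal sketch. It is not MINETRAC's implementation. The partitions are hand-written stand-ins for the output of sub-space clustering, the consensus step links flows whose co-association exceeds a threshold (equivalent to cutting a single-linkage dendrogram at that level), and all function names, the 0.5 threshold, and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_association(partitions, n):
    """Fraction of partitions in which each pair of flows shares a cluster."""
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(coassoc, threshold=0.5):
    """Group flows connected by pairs whose co-association exceeds threshold.

    This equals cutting a single-linkage dendrogram built on the
    co-association matrix; the threshold value is an assumption.
    """
    n = len(coassoc)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if coassoc[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


def map_clusters(cluster_ids, known_labels):
    """Assign each cluster the majority label among its ground-truth flows.

    known_labels maps a flow index to an application name and covers
    only a small labeled subset of the flows.
    """
    votes = {}
    for idx, app in known_labels.items():
        votes.setdefault(cluster_ids[idx], Counter())[app] += 1
    return {c: v.most_common(1)[0][0] for c, v in votes.items()}


# Toy data: three partitions of six flows (hypothetical cluster labels).
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
coassoc = co_association(partitions, 6)
clusters = consensus_clusters(coassoc)

# Only two flows carry ground-truth labels (hypothetical applications).
mapping = map_clusters(clusters, {0: "http", 4: "p2p"})
predicted = [mapping.get(c, "unknown") for c in clusters]
```

In this toy run, flows 2 and 3 share a cluster in only one of the three partitions, so the weak link between them is discarded and two groups remain. Each group then inherits the label of its single ground-truth flow.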