Analysis of the impact of sampling on NetFlow traffic classification

Authors:
Valentín Carela-Español;Pere Barlet-Ros;Albert Cabellos-Aparicio;Josep Solé-Pareta
Affiliations:
Dept. Arquitectura de Computadors, Universitat Politècnica de Catalunya (UPC), Campus Nord, Edif. D6, C. Jordi Girona, 1-3, 08034 Barcelona, Spain (all authors)
Venue:
Computer Networks: The International Journal of Computer and Telecommunications Networking
Year:
2011

Abstract

The traffic classification problem has recently attracted the interest of both network operators and researchers. Several machine learning (ML) methods have been proposed in the literature as a promising solution to this problem. Surprisingly, very few works have studied traffic classification with Sampled NetFlow data, even though Sampled NetFlow is a widely deployed monitoring solution among network operators. In this paper we aim to fill this gap. First, we analyze the performance of current ML methods with NetFlow by adapting a popular ML-based technique. The results show that, although the adapted method obtains accuracy similar to that of previous packet-based methods (~90%), its accuracy degrades drastically in the presence of sampling. To reduce this impact, we propose a solution for network operators that operates on Sampled NetFlow data and achieves good accuracy in the presence of sampling.
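
To illustrate why sampling can distort the flow-level features a classifier relies on, the following is a minimal sketch, not the method evaluated in the paper. It assumes independent random packet sampling with probability p (real NetFlow deployments may use other schemes), and the function names `sample_flow` and `simulate` are hypothetical. It shows that short flows can go entirely unobserved and that per-flow packet counts must be rescaled by 1/p to approximate the true values.

```python
import random


def sample_flow(n_packets, p, rng):
    """Count how many of a flow's packets survive sampling.

    Assumption: each packet is kept independently with probability p.
    """
    return sum(1 for _ in range(n_packets) if rng.random() < p)


def simulate(flow_sizes, p, seed=0):
    """Apply packet sampling to a list of true flow sizes (in packets).

    Returns the number of flows that remain visible (at least one sampled
    packet) and, for each visible flow, a pair of
    (true size, naive estimate = sampled count / p).
    """
    rng = random.Random(seed)
    observed = [sample_flow(n, p, rng) for n in flow_sizes]
    detected = [(n, k) for n, k in zip(flow_sizes, observed) if k > 0]
    return {
        "flows_total": len(flow_sizes),
        "flows_detected": len(detected),
        "estimates": [(n, k / p) for n, k in detected],
    }


if __name__ == "__main__":
    # Hypothetical workload: many 2-packet flows and a few 1000-packet flows.
    sizes = [2] * 1000 + [1000] * 10
    for rate in (1.0, 0.1, 0.01):
        result = simulate(sizes, rate)
        print(rate, result["flows_detected"], "of", result["flows_total"])
```

Under these assumptions, most short flows disappear at low sampling rates, while the estimates for the flows that remain are noisy. Features derived from them, such as packet counts, are therefore unreliable unless the sampling is taken into account.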