Towards optimal sampling for flow size estimation

Authors:
Paul Tune; Darryl Veitch
Affiliations:
University of Melbourne, Melbourne, Australia; University of Melbourne, Melbourne, Australia
Venue:
Proceedings of the 8th ACM SIGCOMM conference on Internet measurement
Year:
2008

Abstract

The flow size distribution is a useful metric for traffic modeling and management. It is well known, however, that estimating it from sampled data is problematic. Previous work has shown that flow sampling (FS) offers enormous statistical benefits over packet sampling, but it suffers from high resource requirements and is not currently used in routers. In this paper we present Dual Sampling (DS), which can to a large extent provide flow-sampling-like statistical performance at packet-sampling-like computational cost. Our work is grounded in a Fisher information based approach recently used to evaluate a number of sampling schemes for TCP flows, though not FS. We show how to revise and extend the approach to include FS as well as DS and others, and how to make rigorous and fair comparisons. We show that DS significantly outperforms other packet-based methods, but also prove that DS is inferior to FS. However, since DS is a two-parameter family of methods that includes FS as a special case, DS can be used to approach FS continuously. We then describe a packet-sampling-based implementation of DS and analyze its key computational costs to show that router implementation is feasible. Our approach offers insights into many issues, including how the notions of 'flow quality' and 'packet gain' can be used to understand the relative performance of methods, and how the problem of optimal sampling can be formulated. Our work is theoretical, with some simulation support and a case study on Internet data.
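To make the contrast between packet sampling and flow sampling concrete, the following is a minimal sketch, not the authors' implementation. It assumes i.i.d. (Bernoulli) sampling of packets or of whole flows; the function names, the toy flow sizes, and the rates are hypothetical choices for illustration only. Dual Sampling itself is not reproduced here, since the abstract does not define its two parameters.

```python
import random
from collections import Counter


def packet_sampling(flow_sizes, p, rng):
    """Keep each packet independently with probability p (assumed i.i.d.).

    Returns the observed (thinned) size of every flow that had at least
    one packet sampled; flows with no sampled packet are never seen.
    """
    observed = []
    for size in flow_sizes:
        kept = sum(1 for _ in range(size) if rng.random() < p)
        if kept > 0:
            observed.append(kept)
    return observed


def flow_sampling(flow_sizes, q, rng):
    """Keep each flow independently with probability q (assumed i.i.d.).

    A selected flow keeps all of its packets, so its size is observed exactly.
    """
    return [size for size in flow_sizes if rng.random() < q]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy population of flow sizes, in packets.
    flows = [rng.choice([1, 2, 5, 20]) for _ in range(10_000)]

    ps = packet_sampling(flows, 0.1, rng)
    fs = flow_sampling(flows, 0.1, rng)

    # Packet sampling distorts sizes; flow sampling preserves them.
    print("packet sampling sizes:", sorted(Counter(ps).items()))
    print("flow sampling sizes:  ", sorted(Counter(fs).items()))
```

With equal rates, both schemes collect the same expected number of packets, but only flow sampling returns sizes drawn from the original set. Under packet sampling, large flows appear shrunken and small flows are often missed entirely, which is why recovering the original distribution from packet-sampled data is the harder inversion problem.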