Application of sampling methodologies to network traffic characterization
SIGCOMM '93 Conference proceedings on Communications architectures, protocols and applications
New sampling-based summary statistics for improving approximate query answers
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Trajectory sampling for direct traffic observation
IEEE/ACM Transactions on Networking (TON)
Charging from sampled network usage
IMW '01 Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement
New directions in traffic measurement and accounting
Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications
Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement
Flow sampling under hard resource constraints
Proceedings of the joint international conference on Measurement and modeling of computer systems
Proceedings of the 2004 conference on Applications, technologies, architectures, and protocols for computer communications
A robust system for accurate real-time summaries of internet traffic
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Estimating arbitrary subset sums with few probes
Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Sampling algorithms in a stream operator
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Estimating flow distributions from sampled flow statistics
IEEE/ACM Transactions on Networking (TON)
The DLT priority sampling is essentially optimal
Proceedings of the thirty-eighth annual ACM symposium on Theory of computing
Confidence intervals for priority sampling
SIGMETRICS '06/Performance '06 Proceedings of the joint international conference on Measurement and modeling of computer systems
Impact of packet sampling on anomaly detection metrics
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Is sampled data sufficient for anomaly detection?
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Optimal combination of sampled network measurements
IMC '05 Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement
The power of slicing in internet flow measurement
IMC '05 Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement
Sketching unaggregated data streams for subpopulation-size queries
Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Processing top k queries from samples
CoNEXT '06 Proceedings of the 2006 ACM CoNEXT conference
Computational challenges in parsing by classification
CHSLP '06 Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing
Composable, scalable, and accurate weight summarization of unaggregated data sets
Proceedings of the VLDB Endowment
Do you know your IQ?: a research agenda for information quality in systems
ACM SIGMETRICS Performance Evaluation Review
Review: A survey of network flow applications
Journal of Network and Computer Applications
Hi-index | 0.00 |
Measurement, collection, and interpretation of network usage data commonly involves multiple stage of sampling and aggregation. Examples include sampling packets, aggregating them into flow statistics at a router, sampling and aggregation of usage records in a network data repository for reporting, query and archiving. Although unbiased estimates of packet, bytes and flows usage can be formed for each sampling operation, for many applications it is crucial to know the inherent estimation error. Previous work in this area has been limited mainly to analyzing the estimator variance for particular methods, e.g., independent packet sampling. However, the variance is of limited use for more general sampling methods, where the estimate may not be well approximated by a Gaussian distribution. This motivates our paper, in which we establish Chernoff bounds on the likelihood of estimation error in a general multistage combination of measurement sampling and aggregation. We derive the scale against which errors are measured, in terms of the constituent sampling and aggregation operations. In particular this enables us to obtain rigorous confidence intervals around any given estimate. We apply our method to a number of sampling schemes both in the literature and currently deployed, including sampling of packet sampled NetFlow records, Sample and Hold, and Flow Slicing. We obtain one particularly striking result in the first case: that for a range of parameterizations, packet sampling has no additional impact on the estimator confidence derived from our bound, beyond that already imposed by flow sampling.