This paper deals with sampling objects from a large stream. Each object has a size, and the aim is to estimate the total size of an arbitrary subset of objects whose composition is not known at the time of sampling. The problem is motivated by network measurement, in which the objects are flow records exported by routers and the sizes are the numbers of packets or bytes reported in each record. Subsets of interest could be flows from a certain customer or flows from a worm attack. This paper introduces threshold sampling as a sampling scheme that optimally controls the expected volume of samples and the variance of estimators over any classification of flows. It provides algorithms for dynamic control of sample volumes and evaluates them on flow data gathered from a commercial Internet Protocol (IP) network. The algorithms are simple to implement and robust to variation in network conditions. The work reported here has been applied in the measurement infrastructure of the commercial IP network. Without sampling, accommodating the measurement traffic and its processing would have required an order of magnitude greater capital expenditure.
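The abstract does not spell out the sampling rule, so the following is only a minimal sketch under a stated assumption: threshold sampling is taken in its commonly described form, where a flow of size x is kept with probability min(1, x/z) for a threshold z, and each kept flow contributes max(x, z) to the estimate. The function names, the `(key, size)` record layout, and the example data are hypothetical and chosen purely for illustration.

```python
import random

def threshold_sample(flows, z, rng=random):
    """Keep each (key, size) record with probability min(1, size / z).

    Assumed rule (not stated in the abstract): a kept record carries the
    estimate max(size, z), which makes subset-sum estimates unbiased.
    Sizes and z are assumed to be positive.
    """
    sample = []
    for key, size in flows:
        p = min(1.0, size / z)
        if rng.random() < p:  # rng.random() is in [0, 1), so p == 1 always keeps
            sample.append((key, max(size, z)))
    return sample

def estimate_subset_sum(sample, predicate):
    """Estimate the total size of the flows whose key satisfies predicate."""
    return sum(estimate for key, estimate in sample if predicate(key))

def expected_sample_count(flows, z):
    """Expected number of kept records for threshold z."""
    return sum(min(1.0, size / z) for _, size in flows)

# Hypothetical flow records: (customer, bytes).
flows = [("cust-a", 10), ("cust-b", 5), ("cust-a", 200)]
sample = threshold_sample(flows, z=100, rng=random.Random(1))
estimate = estimate_subset_sum(sample, lambda key: key == "cust-a")
```

Under this rule, raising z lowers the expected sample volume (here `expected_sample_count(flows, 100)` is 0.1 + 0.05 + 1), while flows at least as large as z are always kept and reported exactly. The subset is chosen only at estimation time through `predicate`, matching the requirement that its composition need not be known when sampling.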