The histogram of network flow sizes is an important yet difficult metric to estimate in network monitoring. It is important because it characterizes traffic composition and is a crucial component of anomaly detection methods. It is difficult to estimate because of its high memory and computational requirements. Existing algorithms compute fine-grained estimates for each flow size, i.e., 1, 2, ... up to the maximum size observed over a finite time interval. Our approach instead relies on the insight that, while many applications require fine-grained estimates of small flow sizes, i.e., {1, 2, ..., k} for a small k, network operators are often interested only in coarse-grained estimates of larger flow sizes. We therefore propose an estimator that outputs a binned histogram of the flow size distribution. Our estimator computes this histogram in O(k^3 + log W) operations, where W is the largest flow size of interest to the network operator, while requiring only a few bits of memory per measured flow. This translates into more than fourfold memory savings and an exponential speedup in the estimator compared with previous work, greatly increasing the possibility of performing online estimation inside a router.
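To make the output format concrete, the following is a minimal sketch of what a binned flow size histogram could look like: one bin per size for 1..k, then coarser bins up to W. It is not the estimator described above. It bins exactly known flow sizes, and the doubling bin widths above k are an assumption chosen only because they yield roughly log W coarse bins; the function names `bin_edges` and `binned_histogram` are hypothetical.

```python
def bin_edges(k, W):
    """Return inclusive (lo, hi) bins: singletons 1..k, then doubling bins up to W.

    The doubling scheme is an illustrative assumption, not taken from the source.
    """
    bins = [(s, s) for s in range(1, k + 1)]  # fine-grained bins for small sizes
    lo = k + 1
    while lo <= W:
        hi = min(2 * lo - 1, W)  # each coarse bin roughly doubles in width
        bins.append((lo, hi))
        lo = hi + 1
    return bins


def binned_histogram(flow_sizes, k, W):
    """Count flows per bin; sizes outside [1, W] are ignored in this sketch."""
    bins = bin_edges(k, W)
    counts = [0] * len(bins)
    for size in flow_sizes:
        if size < 1 or size > W:
            continue
        for i, (lo, hi) in enumerate(bins):
            if lo <= size <= hi:
                counts[i] += 1
                break
    return list(zip(bins, counts))


if __name__ == "__main__":
    # Hypothetical flow sizes (packets per flow), with k = 3 and W = 20.
    for (lo, hi), count in binned_histogram([1, 1, 2, 5, 9, 16, 40], k=3, W=20):
        label = str(lo) if lo == hi else f"{lo}-{hi}"
        print(f"{label:>6}: {count}")
```

With k = 3 and W = 20 this produces the bins 1, 2, 3, 4-7, 8-15 and 16-20, so small flows keep per-size resolution while larger flows are summarized in a handful of ranges.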