Discovering icebergs in distributed streams of data is an important problem for a number of applications in networking and databases. Previous work has concentrated on measuring these icebergs in either the non-distributed streaming case or the non-streaming distributed case; we present a general framework that allows for distributed processing across multiple streams of data. We compare several state-of-the-art streaming algorithms for estimating local elephants in the individual streams. Because an iceberg may be hidden by being distributed across many different streams, so that it is not an elephant in any single one, we add a sampling component to handle such cases. We provide a novel taxonomy of current sketches and perform a thorough analysis of the strengths and weaknesses of each scheme under various QoS metrics, using both real and synthetic Internet trace data. We summarize their performance and discuss the implications for the future design of sketches.
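
To make the two ingredients concrete, the following is a minimal sketch, not the framework evaluated in the paper. It assumes a Misra-Gries style counter summary as the per-stream elephant detector and independent per-item sampling with probability `p` as the sampling component. The function names (`misra_gries`, `sample_counts`, `find_icebergs`) and the parameters `k`, `p`, `threshold`, and `seed` are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def misra_gries(stream, k):
    """Hypothetical local elephant detector (Misra-Gries summary).

    Keeps at most k - 1 counters; any item whose frequency exceeds
    len(stream) / k is guaranteed to remain in the result.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


def sample_counts(stream, p, rng):
    """Hypothetical sampling component: count each item with probability p."""
    sampled = Counter()
    for item in stream:
        if rng.random() < p:
            sampled[item] += 1
    return sampled


def find_icebergs(streams, threshold, k=10, p=0.5, seed=0):
    """Combine local summaries with scaled samples (illustrative only).

    Returns (icebergs, local_elephants): items whose estimated global
    count reaches `threshold`, and the union of per-stream candidates.
    """
    rng = random.Random(seed)
    local_elephants = set()
    estimates = Counter()
    for stream in streams:
        local_elephants.update(misra_gries(stream, k))
        for item, count in sample_counts(stream, p, rng).items():
            # Scale each sampled count by 1/p to estimate the true count.
            estimates[item] += count / p
    icebergs = {i: est for i, est in estimates.items() if est >= threshold}
    return icebergs, local_elephants


if __name__ == "__main__":
    # "x" is small in every stream but large in aggregate.
    streams = [["x"] * 3 + ["a", "b"],
               ["x"] * 3 + ["c", "d"],
               ["x"] * 3 + ["e", "f"]]
    print(find_icebergs(streams, threshold=8, p=1.0))
```

With `p = 1.0` every item is counted, so the global estimates are exact. Lowering `p` reduces the data each site must forward at the cost of estimation error, which is the kind of resource/accuracy tradeoff the abstract's QoS analysis refers to.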