In this paper we present an improved version of a multistage-hashing-based algorithm for finding frequent items in a stream. Our algorithm uses low-pass filters instead of simple counters, so it concentrates on recent items and disregards old ones. This behaviour is similar to that of sliding-window-based algorithms, but it requires less memory and is suitable for real-time applications. The algorithm continuously produces frequency estimates for the most frequent items. It was tested on streams with various frequency distributions and worked correctly.
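The idea of replacing counters with low-pass filters can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the filter, so this assumes a first-order exponential filter y ← (1 − α)·y + α·x in every cell, applied lazily by storing the time of each cell's last update. The class name `LowPassMultistageFilter`, the parameters `stages`, `width`, and `alpha`, and the use of salted BLAKE2b as the per-stage hash are all hypothetical choices made for illustration.

```python
import hashlib


class LowPassMultistageFilter:
    """Multistage hash table whose cells are exponential low-pass filters.

    Assumption: each cell follows y <- (1 - alpha) * y + alpha * x, where
    x is 1 when the arriving item hashes to the cell and 0 otherwise.
    """

    def __init__(self, stages=4, width=256, alpha=0.01):
        self.stages = stages
        self.width = width
        self.alpha = alpha
        self.t = 0  # number of items seen so far
        self.value = [[0.0] * width for _ in range(stages)]
        self.last = [[0] * width for _ in range(stages)]  # last update time

    def _cell(self, stage, item):
        # One independent hash per stage, obtained by salting with the stage index.
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=stage.to_bytes(2, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def _decayed(self, stage, idx):
        # Apply the decay for all steps in which this cell received no item.
        age = self.t - self.last[stage][idx]
        return self.value[stage][idx] * (1.0 - self.alpha) ** age

    def update(self, item):
        self.t += 1
        for s in range(self.stages):
            i = self._cell(s, item)
            self.value[s][i] = self._decayed(s, i) + self.alpha
            self.last[s][i] = self.t

    def estimate(self, item):
        # Hash collisions can only inflate a cell, so the minimum over
        # stages is the tightest available estimate of recent frequency.
        return min(
            self._decayed(s, self._cell(s, item)) for s in range(self.stages)
        )


if __name__ == "__main__":
    f = LowPassMultistageFilter()
    for n in range(2000):
        f.update("old-heavy-hitter")
        f.update(n)  # unique background item
    for n in range(2000, 4000):
        f.update("new-heavy-hitter")
        f.update(n)
    print(f.estimate("old-heavy-hitter"), f.estimate("new-heavy-hitter"))
```

In this sketch the estimate approximates an item's share of recent traffic rather than an absolute count, and `alpha` plays a role loosely comparable to the inverse of a sliding-window length: a larger value forgets old items faster. Memory stays fixed at `stages * width` cells regardless of how long the stream runs, whereas an exact sliding window would have to retain the items it covers.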