Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
Data streams: algorithms and applications
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Discovery of Multiple-Level Association Rules from Large Databases
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Mining Generalized Association Rules
VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Finding Frequent Items in Data Streams
ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
Frequency Estimation of Internet Packet Streams with Limited Space
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
A simple algorithm for finding frequent elements in streams and bags
ACM Transactions on Database Systems (TODS)
What's hot and what's not: tracking most frequent items dynamically
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Stable distributions, pseudorandom generators, embeddings and data stream computation
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice
ACM Transactions on Computer Systems (TOCS)
Diamond in the rough: finding Hierarchical Heavy Hitters in multi-dimensional data
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Space complexity of hierarchical heavy hitters in multi-dimensional data streams
Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Efficient computation of frequent and top-k elements in data streams
ICDT'05 Proceedings of the 10th international conference on Database Theory
Hi-index | 0.00 |
In the research of high-speed data streams, large amounts of synthetic data are needed. These days, more and more researchers focus on hierarchical multi-dimensional data streams or data sets, which is beyond the ability of traditional synthetic data generators. In this paper we propose a two-phased method to generate hierarchical multi-dimensional data streams, in which a tree-like structure is built first, and then an unlimited number of items chosen among the tree leaves according to a distribution are inserted into the stream. Our generator, HIDS, integrates all of the functions of existing data generators, and can customize the tree structure according to usersý requirements, producing tree structures such as equal-depth trees, equal-fan-out trees, balanced trees and different-fan-out trees. An experimental study using real data streams shows that HIDS can generate data streams tailored to specific applications.