HIDS: a multifunctional generator of hierarchical data streams

Authors:
Xiaoyu Wang;Hongyan Liu;Daoxin Er
Affiliations:
Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;Tsinghua University, Beijing, China
Venue:
ACM SIGMIS Database
Year:
2009

Abstract

Research on high-speed data streams requires large amounts of synthetic data. Researchers increasingly focus on hierarchical multi-dimensional data streams and data sets, which traditional synthetic data generators cannot produce. In this paper we propose a two-phase method for generating hierarchical multi-dimensional data streams: a tree-like structure is built first, and then an unlimited number of items, chosen among the tree leaves according to a distribution, are inserted into the stream. Our generator, HIDS, integrates all of the functions of existing data generators and can customize the tree structure according to users' requirements, producing equal-depth trees, equal-fan-out trees, balanced trees, and different-fan-out trees. An experimental study using real data streams shows that HIDS can generate data streams tailored to specific applications.
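
The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the HIDS implementation: the function names, the single-dimension simplification, the equal-fan-out tree shape, and the Zipf-like leaf distribution are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import itertools
import random
from typing import Iterator, List, Tuple

# Hypothetical representation: a leaf is identified by its root-to-leaf path.
Leaf = Tuple[int, ...]


def build_equal_fanout_tree(depth: int, fanout: int) -> List[Leaf]:
    """Phase 1 (assumed shape): enumerate the leaves of a tree in which
    every internal node has `fanout` children and every leaf is at `depth`."""
    return list(itertools.product(range(fanout), repeat=depth))


def zipf_cumulative_weights(n: int, skew: float) -> List[float]:
    """Cumulative Zipf-like weights: the leaf at rank r gets weight 1 / r**skew.
    The distribution is an assumption; any other weighting could be used."""
    weights = (1.0 / (rank ** skew) for rank in range(1, n + 1))
    return list(itertools.accumulate(weights))


def generate_stream(leaves: List[Leaf], skew: float, seed: int) -> Iterator[Leaf]:
    """Phase 2: yield an unbounded stream of leaves drawn from the distribution."""
    rng = random.Random(seed)
    cum_weights = zipf_cumulative_weights(len(leaves), skew)
    while True:
        yield rng.choices(leaves, cum_weights=cum_weights, k=1)[0]


# Usage: a depth-3 tree with fan-out 4 has 4**3 = 64 leaves.
leaves = build_equal_fanout_tree(depth=3, fanout=4)
sample = list(itertools.islice(generate_stream(leaves, skew=1.0, seed=42), 1000))
```

Because each item is a full root-to-leaf path, its ancestors at every level of the hierarchy can be recovered by truncating the tuple. A multi-dimensional variant could, under the same assumptions, build one such tree per dimension and emit one leaf from each.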