Computational geometry: an introduction
Computational geometry: an introduction
Robust regression and outlier detection
Robust regression and outlier detection
Equi-depth multidimensional histograms
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Space-efficient online computation of quantile summaries
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
STHoles: a multidimensional workload-aware histogram
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Dynamic multidimensional histograms
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Data streams: algorithms and applications
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Proceedings of the 17th International Conference on Data Engineering
Gigascope: a stream database for network applications
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Range counting over multidimensional data streams
SCG '04 Proceedings of the twentieth annual symposium on Computational geometry
Data streaming algorithms for efficient and accurate estimation of flow size distribution
Proceedings of the joint international conference on Measurement and modeling of computer systems
Holistic UDAFs at streaming speeds
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Medians and beyond: new aggregation techniques for sensor networks
SenSys '04 Proceedings of the 2nd international conference on Embedded networked sensor systems
Progressive skyline computation in database systems
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2003
Space- and time-efficient deterministic algorithms for biased quantiles over data streams
Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Top-k skyline: a unified approach
OTM'05 Proceedings of the 2005 OTM Confederated international conference on On the Move to Meaningful Internet Systems
EDBT'06 Proceedings of the 10th international conference on Advances in Database Technology
Adaptive spatial partitioning for multidimensional data streams
ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation
Ranking uncertain sky: The probabilistic top-k skyline operator
Information Systems
Hi-index | 0.00 |
Much real data consists of more than one dimension, such as financial transactions (eg, price × volume) and IP network flows (eg, duration × numBytes), and capture relationships between the variables. For a single dimension, quantiles are intuitive and robust descriptors. Processing and analyzing such data, particularly in data warehouse or data streaming settings, requires similarly robust and informative statistical descriptors that go beyond one-dimension. Applying quantile methods to summarize a multidimensional distribution along only singleton attributes ignores the rich dependence amongst the variables.In this paper, we present new skyline-based statistical descriptors for capturing the distributions over pairs of dimensions. They generalize the notion of quantiles in the individual dimensions, and also incorporate properties of the joint distribution. We introduce 茂戮驴-quantoursand 茂戮驴-radials, which are skyline points over subsets of the data, and propose (茂戮驴, 茂戮驴)-quantiles, found from the union of these skylines, as statistical descriptors of two-dimensional distributions. We present efficient online algorithms for tracking (茂戮驴,茂戮驴)-quantiles on two-dimensional streams using guaranteed small space. We identify the principal properties of the proposed descriptors and perform extensive experiments with synthetic and real IP traffic data to study the efficiency of our proposed algorithms.