Finding approximate answers to multi-dimensional range queries over real-valued attributes has significant applications in data exploration and database query optimization. In this paper we consider the following problem: given a table of d attributes whose domains are the real numbers, and a query that specifies a range in each dimension, find a good approximation of the number of records in the table that satisfy the query.

We present a new histogram technique that is designed to approximate the density of multi-dimensional datasets with real attributes. Our technique finds buckets of variable size and allows the buckets to overlap. Overlapping buckets allow more efficient approximation of the density. The size of each bucket is based on the local density of the data. This technique leads to a faster and more compact approximation of the data distribution. We also show how to generalize kernel density estimators, and how to apply them to the multi-dimensional query approximation problem.

Finally, we compare the accuracy of the proposed techniques with existing techniques using real and synthetic datasets.
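To make the query approximation problem concrete, the following is a minimal sketch of how a textbook kernel density estimator can answer a d-dimensional range-count query from a random sample. It is not the generalized estimator proposed in the paper, whose details are not given here. The Epanechnikov product kernel, the fixed per-dimension bandwidth, and the names `epanechnikov_cdf` and `kde_range_count` are all assumptions made for illustration.

```python
from typing import Sequence


def epanechnikov_cdf(u: float) -> float:
    """CDF of the Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1]."""
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return 0.5 + 0.75 * u - 0.25 * u ** 3


def kde_range_count(
    sample: Sequence[Sequence[float]],
    lo: Sequence[float],
    hi: Sequence[float],
    bandwidth: Sequence[float],
    table_size: int,
) -> float:
    """Estimate how many of table_size records fall in the box [lo, hi].

    Each sample point spreads one unit of probability mass through a
    product kernel. The mass that lands inside the query box has a closed
    form, so no numerical integration is needed.
    Assumption: one fixed bandwidth per dimension (hypothetical choice).
    """
    total = 0.0
    for point in sample:
        mass = 1.0
        for x, a, b, h in zip(point, lo, hi, bandwidth):
            # Fraction of this point's 1-D kernel lying inside [a, b].
            mass *= epanechnikov_cdf((b - x) / h) - epanechnikov_cdf((a - x) / h)
            if mass == 0.0:
                break  # the point contributes nothing to this query
        total += mass
    # Scale the estimated selectivity to the size of the full table.
    return table_size * total / len(sample)


sample = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
estimate = kde_range_count(
    sample, lo=(-1.0, -1.0), hi=(2.0, 2.0), bandwidth=(0.5, 0.5), table_size=1000
)
print(estimate)  # 500.0: two of the four sample kernels lie fully inside the box
```

A sample point whose kernel straddles the query boundary contributes a fractional mass, which is what makes the estimate vary smoothly with the query range. The choice of bandwidth controls that smoothing and would normally be tuned to the data.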