Query optimization is an integral part of relational database management systems. One important task in query optimization is selectivity estimation, that is, given a query P, we need to estimate the fraction of records in the database that satisfy P. Many commercial database systems maintain histograms to approximate the frequency distribution of values in the attributes of relations.

In this paper, we present a technique based upon a multiresolution wavelet decomposition for building histograms on the underlying data distributions, with applications to databases, statistics, and simulation. Histograms built on the cumulative data distributions give very good approximations with limited space usage. We give fast algorithms for constructing histograms and using them in an on-line fashion for selectivity estimation. Our histograms also provide quick approximate answers to OLAP queries when the exact answers are not required. Our method captures the joint distribution of multiple attributes effectively, even when the attributes are correlated. Experiments confirm that our histograms offer substantial improvements in accuracy over random sampling and other previous approaches.
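The idea of a wavelet-based histogram on a cumulative distribution can be illustrated with a small single-attribute sketch. The abstract does not name the wavelet basis or the rule for choosing which coefficients to keep, so the code below assumes an orthonormal Haar transform and keeps the `m` coefficients of largest magnitude. All function names (`haar_decompose`, `build_synopsis`, `estimate_selectivity`) are hypothetical and are not taken from the paper, and the domain size is assumed to be a power of two.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(values):
    """Orthonormal Haar transform; len(values) must be a power of two (assumption)."""
    coeffs = list(values)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        sums = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        diffs = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = sums + diffs  # coarse part first, detail part after it
        n = half
    return coeffs


def haar_reconstruct(coeffs):
    """Inverse of haar_decompose."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for s, d in zip(out[:n], out[n:2 * n]):
            merged.append((s + d) / SQRT2)
            merged.append((s - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out


def build_synopsis(freqs, m):
    """Keep the m largest-magnitude Haar coefficients of the cumulative counts."""
    cumulative, running = [], 0
    for f in freqs:
        running += f
        cumulative.append(running)
    coeffs = haar_decompose(cumulative)
    largest = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)[:m]
    return {i: coeffs[i] for i in largest}


def estimate_selectivity(synopsis, size, total, lo, hi):
    """Estimate the fraction of records with attribute value in [lo, hi]."""
    coeffs = [synopsis.get(i, 0.0) for i in range(size)]  # dropped coefficients are 0
    approx = haar_reconstruct(coeffs)  # approximate cumulative counts
    below = approx[lo - 1] if lo > 0 else 0.0
    fraction = (approx[hi] - below) / total
    return min(1.0, max(0.0, fraction))  # clamp to a valid fraction


# Toy frequency vector over a domain of 8 attribute values (made-up data).
freqs = [4, 0, 2, 6, 1, 1, 0, 2]
total = sum(freqs)
exact = sum(freqs[2:4]) / total
full = build_synopsis(freqs, len(freqs))  # all coefficients kept: lossless
small = build_synopsis(freqs, 3)          # compressed synopsis
print(exact,
      estimate_selectivity(full, len(freqs), total, 2, 3),
      estimate_selectivity(small, len(freqs), total, 2, 3))
```

For simplicity the sketch reconstructs the whole cumulative vector on every query. A practical estimator would evaluate only the retained coefficients that overlap the two query endpoints.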