Present database systems process all the data related to a query before returning a response. As a result, the volume of data to be processed becomes excessive for real-time and time-constrained environments. A new methodology is needed to systematically reduce the time spent processing the data a query involves. To this end, we propose to use data samples to construct an approximate, synthetic response to a given query.

In this paper, we consider only COUNT(E) type queries, where E is an arbitrary relational algebra expression. We make no assumptions about the distribution of attribute values or the ordering of tuples in the input relations, and we propose consistent and unbiased estimators for arbitrary COUNT(E) type queries. We design a sampling plan based on the cluster sampling method to improve the utilization of sampled data and to reduce the cost of sampling. We also evaluate the performance of the proposed estimators.
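To make the idea of a sampling-based COUNT estimate concrete, here is a minimal sketch. It is not the estimator construction from the paper. It assumes the simplest possible case: E is a single selection over one relation, the relation is stored as a list of disk pages, and each page acts as a cluster. The names `count_on_pages`, `estimate_count`, and `mean_over_all_samples` are hypothetical and chosen only for illustration. Pages are drawn by simple random sampling without replacement, and the hits found on the sampled pages are scaled up by the ratio of total pages to sampled pages.

```python
import itertools
import random
from typing import Callable, Iterable, Sequence, Tuple

Row = Tuple[int, ...]
Page = Sequence[Row]  # one disk page = one cluster of tuples (assumption)


def count_on_pages(
    pages: Sequence[Page],
    predicate: Callable[[Row], bool],
    page_ids: Iterable[int],
) -> float:
    """Scale the hits found on the given pages up to the whole relation."""
    ids = list(page_ids)
    hits = sum(1 for i in ids for row in pages[i] if predicate(row))
    return len(pages) * hits / len(ids)


def estimate_count(
    pages: Sequence[Page],
    predicate: Callable[[Row], bool],
    sample_pages: int,
    rng: random.Random,
) -> float:
    """Estimate COUNT(select_predicate(R)) from a random sample of pages."""
    if not 0 < sample_pages <= len(pages):
        raise ValueError("sample_pages must be between 1 and len(pages)")
    # Simple random sampling of clusters (pages) without replacement.
    page_ids = rng.sample(range(len(pages)), sample_pages)
    return count_on_pages(pages, predicate, page_ids)


def mean_over_all_samples(
    pages: Sequence[Page],
    predicate: Callable[[Row], bool],
    sample_pages: int,
) -> float:
    """Average the estimate over every possible page sample (toy sizes only).

    For an unbiased estimator this average equals the true count.
    """
    estimates = [
        count_on_pages(pages, predicate, ids)
        for ids in itertools.combinations(range(len(pages)), sample_pages)
    ]
    return sum(estimates) / len(estimates)
```

Sampling whole pages rather than individual tuples means every tuple on a fetched page contributes to the estimate, which is the sense in which cluster sampling improves the utilization of sampled data. Because each page is equally likely to be drawn, averaging the estimate over all possible samples recovers the exact count, which `mean_over_all_samples` checks by brute force on small inputs. Handling arbitrary relational algebra expressions (joins, set operations, projections) requires more than this sketch covers.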