This work addresses the problem of efficiently processing queries in very large numerical databases. Previous work has focused on designing index structures for efficient data access. More recently, statistical methods have increasingly been used in query optimization; these methods approximate the distribution of attribute values in order to estimate the selectivity of queries. This work introduces a methodology that uses regression techniques to approximate the actual attribute values. Through analysis of the data, one derives a set of characteristic functions that form a "regression database," a compressed image of the original database. Based on these functions, approximate answers to queries can be provided within a pre-specified error tolerance, without the expensive search overhead usually inherent in indexing techniques. A framework for building regression databases is proposed, and an experimental prototype is implemented to evaluate the technique in terms of realizability, efficiency, and practicality. The technique is complementary to both conventional approaches and statistical methods.
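To make the idea of a "regression database" more concrete, here is a minimal sketch in Python. It is not the method of the paper, whose characteristic functions and construction framework are not described in the abstract. As an assumption for illustration only, it models a sorted numeric column as a function of row position and greedily covers it with least-squares line segments, so that every stored value is reproduced within a chosen tolerance `tol`. The names `fit_line`, `build_regression_db`, and `approx_value` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical segment layout: (start, end_exclusive, slope, intercept).
Segment = Tuple[int, int, float, float]


def fit_line(ys: Sequence[float], start: int) -> Tuple[float, float]:
    """Least-squares line y = slope * x + intercept for x = start, start + 1, ..."""
    n = len(ys)
    if n == 1:
        return 0.0, float(ys[0])
    xs = range(start, start + n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_error(ys: Sequence[float], start: int, slope: float, intercept: float) -> float:
    """Largest absolute residual of the line over the given values."""
    return max(abs(slope * (start + k) + intercept - y) for k, y in enumerate(ys))


def build_regression_db(values: Sequence[float], tol: float) -> List[Segment]:
    """Greedily grow each segment while its fitted line stays within tol.

    Assumption: a simple quadratic-time greedy pass, chosen for clarity
    rather than efficiency.
    """
    segments: List[Segment] = []
    n = len(values)
    start = 0
    while start < n:
        end = start + 1
        slope, intercept = fit_line(values[start:end], start)
        while end < n:
            candidate = values[start:end + 1]
            s, b = fit_line(candidate, start)
            if max_error(candidate, start, s, b) > tol:
                break
            slope, intercept, end = s, b, end + 1
        segments.append((start, end, slope, intercept))
        start = end
    return segments


def approx_value(segments: Sequence[Segment], i: int) -> float:
    """Approximate the value at row position i from the stored functions only."""
    for start, end, slope, intercept in segments:
        if start <= i < end:
            return slope * i + intercept
    raise IndexError(i)
```

In this sketch the list of segments plays the role of the compressed image: a lookup evaluates one stored function instead of reading the original rows, and the error of each answer is bounded by `tol` by construction. A real system would need a cheaper fitting procedure and a way to locate the right segment without a linear scan.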