Approximate Query Answering In Numerical Databases

Authors:
Nabil I. Hachem; Chenye Bao; Steve Taylor
Venue:
SSDBM '96 Proceedings of the Eighth International Conference on Scientific and Statistical Database Management
Year:
1996


Abstract

This work addresses the problem of efficiently processing queries in very large numerical databases. Previous work has focused on the design of index structures for efficient data access. More recently, statistical methods have increasingly been used in query optimization; these methods approximate the distribution of attribute values in order to estimate query selectivity. This work introduces a methodology that instead uses regression techniques to approximate the actual attribute values. Through analysis of the data, one derives a set of characteristic functions that form a "regression database," a compressed image of the original database. Based on these functions, approximate answers to queries can be provided within a pre-specified error tolerance, without the expensive search overhead usually inherent in indexing techniques. A framework for building regression databases is proposed, and an experimental prototype is implemented to evaluate the technique in terms of realizability, efficiency, and practicality. The technique is complementary to conventional approaches and to statistical methods.
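
The idea of replacing stored values with fitted functions that stay within an error tolerance can be illustrated with a minimal sketch. This is not the authors' framework: the greedy piecewise-linear fit, the use of row position as the independent variable, and the names build_regression_db and approx_value are all assumptions made here for illustration only.

```python
from bisect import bisect_right


def fit_line(ys, start, end):
    """Least-squares line y = a*x + b over positions start..end-1."""
    n = end - start
    if n == 1:
        # A single point is represented exactly by a constant function.
        return 0.0, float(ys[start])
    xs = range(start, end)
    mean_x = sum(xs) / n
    mean_y = sum(ys[start:end]) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ys[x] - mean_y) for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def max_error(ys, start, end, a, b):
    """Largest absolute deviation of the line from the stored values."""
    return max(abs(ys[x] - (a * x + b)) for x in range(start, end))


def build_regression_db(ys, tol):
    """Greedily grow each segment while its refitted line stays within tol.

    Returns a list of (start, end, slope, intercept) tuples; this list is
    the hypothetical "regression database" for the column ys.
    """
    segments = []
    start = 0
    while start < len(ys):
        end = start + 1
        a, b = fit_line(ys, start, end)
        while end < len(ys):
            a2, b2 = fit_line(ys, start, end + 1)
            if max_error(ys, start, end + 1, a2, b2) > tol:
                break  # the next point would violate the tolerance
            end, a, b = end + 1, a2, b2
        segments.append((start, end, a, b))
        start = end
    return segments


def approx_value(segments, i):
    """Approximate the value at position i without touching the raw data."""
    starts = [seg[0] for seg in segments]
    k = bisect_right(starts, i) - 1
    _, _, a, b = segments[k]
    return a * i + b
```

Calling build_regression_db(column, tol) on a numeric column returns a handful of line segments in place of the raw values, and approx_value(segments, i) then answers a point lookup with an error of at most tol. The greedy refit is quadratic in segment length, which is acceptable for a sketch but would need a cheaper incremental fit on very large data.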