Range query estimation with data skewness for top-k retrieval

Authors:
Anteneh Ayanso;Paulo B. Goes;Kumar Mehta
Affiliations:
Department of Finance, Operations, and Information Systems, Goodman School of Business, Brock University, 500 Glenridge Avenue, St. Catharines, ON L2S 3A1, Canada;Department of Management Information Systems, Eller College of Management, University of Arizona, 1130 E. Helen Street, Tucson, AZ 85721, USA;Department of Decision Science and MIS, School of Management, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
Venue:
Decision Support Systems
Year:
2014


Abstract

Top-k querying can significantly improve the performance of web-based business intelligence applications such as price comparison and product recommendation systems. Top-k retrieval finds the limited number of records in a relational database that are most similar to user-specified attribute-value pairs. This paper extends the cost-based query-mapping method for top-k retrieval by incorporating data skewness in range estimation. Experiments on real-world and synthetic multi-attribute data sets show that incorporating data skewness provides robust performance across different types of data sets, query sets, distance functions, and histograms.
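
To make the idea of mapping a top-k query onto a range query concrete, here is a minimal single-attribute sketch in Python. It is an illustration under stated assumptions, not the method of this paper: the function names (`build_histogram`, `estimate_count`, `radius_for_k`, `top_k_via_range`), the equi-width histogram, the binary search over the radius, and the doubling restart are all hypothetical choices. In particular, the estimator assumes values are spread uniformly inside each bucket, which is the kind of assumption a skew-aware range estimate would aim to relax.

```python
import heapq


def build_histogram(values, lo, hi, buckets):
    """Equi-width histogram over [lo, hi] as a list of (left, right, count)."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        # Clamp so that v == hi falls into the last bucket.
        i = min(int((v - lo) / width), buckets - 1)
        counts[i] += 1
    return [(lo + i * width, lo + (i + 1) * width, c)
            for i, c in enumerate(counts)]


def estimate_count(hist, a, b):
    """Estimated number of values in [a, b].

    Assumption: values are uniformly spread inside each bucket.
    """
    total = 0.0
    for left, right, count in hist:
        overlap = min(b, right) - max(a, left)
        if overlap > 0:
            total += count * overlap / (right - left)
    return total


def radius_for_k(hist, q, k, max_radius, steps=40):
    """Binary-search a small radius r whose estimated count around q is >= k."""
    lo_r, hi_r = 0.0, max_radius
    for _ in range(steps):
        mid = (lo_r + hi_r) / 2
        if estimate_count(hist, q - mid, q + mid) >= k:
            hi_r = mid
        else:
            lo_r = mid
    return hi_r


def top_k_via_range(values, hist, q, k, max_radius):
    """Answer a top-k query with a range query, widening it on a shortfall."""
    r = radius_for_k(hist, q, k, max_radius)
    while True:
        # The range query: in a database this would be a WHERE clause.
        candidates = [v for v in values if q - r <= v <= q + r]
        if len(candidates) >= k or r >= max_radius:
            # Every value outside the range is farther from q than r,
            # so ranking the candidates gives the true top-k.
            return heapq.nsmallest(k, candidates, key=lambda v: abs(v - q))
        r = min(2 * r, max_radius)  # estimate was too small: restart wider
```

On skewed data the uniform-within-bucket estimate can misjudge the radius, which costs either a restart (range too narrow) or extra retrieved tuples (range too wide); this trade-off is what a cost-based mapping has to balance.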