A formal analysis of why heuristic functions work

Authors:
B. John Oommen; Luis G. Rueda
Affiliations:
Senior Member, IEEE, School of Computer Science, Carleton University, 1125 Colonel By Dr., Ottawa, ON, K1S 5B6, Canada; School of Computer Science, University of Windsor, 401 Sunset Ave., Windsor, ON, N9B 3P4, Canada
Venue:
Artificial Intelligence
Year:
2005

Abstract

Many optimization problems in computer science have been proven to be NP-hard, so no polynomial-time algorithm can solve them unless P = NP. Instead, they are solved using heuristic algorithms, which provide a sub-optimal solution that, hopefully, is arbitrarily close to the optimal one. Such problems arise in a wide range of applications, including artificial intelligence, game theory, graph partitioning, and database query optimization. Consider a heuristic algorithm, A, and suppose that A could invoke one of two possible heuristic functions. The question of which heuristic function is superior has typically demanded a yes/no answer, one that is often substantiated only by empirical evidence. In this paper, by using Pattern Classification Techniques (PCT), we propose a formal, rigorous theoretical model that provides a stochastic answer to this question. We prove that, given a heuristic algorithm, A, that could use either of two heuristic functions, H1 or H2, to find the solution to a particular problem, if the accuracy of evaluating the cost of the optimal solution using H1 is greater than the accuracy of evaluating that cost using H2, then H1 has a higher probability than H2 of leading to the optimal solution. This previously unproven conjecture has been the basis for designing numerous algorithms, such as the A* algorithm and its variants. Apart from formally proving the result, we also address the corresponding database query optimization problem, which has been open for at least two decades. To validate our proofs, we report empirical results on database query optimization techniques involving a few well-known histogram estimation methods.
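The central claim of the abstract, that a more accurate heuristic function is more likely to lead to the optimal solution, can be illustrated with a small numerical sketch. It is not the paper's model: it assumes, purely for illustration, that a heuristic estimates the cost of each of two candidate solutions as the true cost plus independent zero-mean Gaussian noise, and that "accuracy" corresponds to a smaller noise standard deviation. The function name `prob_pick_optimal`, the costs, and the sigma values are all hypothetical.

```python
import math


def prob_pick_optimal(cost_opt, cost_alt, sigma):
    """Probability that a noisy estimator ranks the optimal solution first.

    Illustrative assumption (not taken from the paper): each estimated cost is
    the true cost plus independent zero-mean Gaussian noise with standard
    deviation sigma, and the solution with the lower estimate is chosen.
    """
    gap = cost_alt - cost_opt  # positive, since cost_opt is the smaller true cost
    # The difference of two independent N(0, sigma^2) errors is N(0, 2 * sigma^2),
    # so the optimal solution wins when a N(0, 1) variable is below z.
    z = gap / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF at z


# Hypothetical true costs: the optimal solution costs 10, the alternative 12.
p_h1 = prob_pick_optimal(10.0, 12.0, sigma=1.0)  # H1: the more accurate estimator
p_h2 = prob_pick_optimal(10.0, 12.0, sigma=3.0)  # H2: the less accurate estimator
print(f"P(H1 picks optimal) = {p_h1:.3f}")
print(f"P(H2 picks optimal) = {p_h2:.3f}")
```

Under these assumptions the estimator with the smaller error spread selects the cheaper solution with higher probability, and both probabilities approach 0.5 as the noise grows relative to the cost gap. The closed form is used in place of simulation so that the result is deterministic.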