Levelwise Search and Borders of Theories in Knowledge Discovery

  • Authors:
  • Heikki Mannila; Hannu Toivonen

  • Affiliations:
  • Department of Computer Science, P.O. Box 26, FIN–00014 University of Helsinki, Finland (both authors).

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 1997

Abstract

One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all sentences of L deemed interesting by the selection predicate. We analyze the simple levelwise algorithm for finding all such descriptions. We give bounds for the number of database accesses that the algorithm makes. For this, we introduce the concept of the border of a theory, a notion that turns out to be surprisingly powerful in analyzing the algorithm. We also consider the verification problem of a KDD process: given r and a set of sentences S ⊆ L, determine whether S is exactly the set of interesting statements about r. We show strong connections between the verification problem and the hypergraph transversal problem. The verification problem arises in a natural way when using sampling to speed up the pattern discovery step in KDD.
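
To make the levelwise idea concrete, the following is a minimal sketch for one familiar instance of the problem, frequent itemsets. It is an illustration under stated assumptions, not the formulation used in the paper: the sentences of L are taken to be non-empty itemsets, the selection predicate is "the itemset occurs in at least min_support rows of r", and the function name levelwise, its parameters, and the toy data are all hypothetical. The sketch evaluates candidates level by level, keeps the interesting ones, and records the rejected candidates, which here play the role of the minimal uninteresting sentences on the border.

```python
from itertools import combinations


def levelwise(rows, items, min_support):
    """Levelwise search for frequent itemsets (illustrative sketch).

    Returns (theory, negative_border, evaluations), where `evaluations`
    counts how many candidates were tested against the data.
    """

    def is_interesting(itemset):
        # Assumed selection predicate: the itemset is contained in at
        # least `min_support` rows of the data set.
        return sum(1 for row in rows if itemset <= row) >= min_support

    theory, negative_border = set(), set()
    evaluations = 0
    level = 1
    # Level 1 candidates: all singleton itemsets (the empty set is skipped).
    candidates = {frozenset([item]) for item in items}

    while candidates:
        accepted = set()
        for candidate in candidates:
            evaluations += 1
            if is_interesting(candidate):
                accepted.add(candidate)
            else:
                # Every proper non-empty subset was accepted earlier, so this
                # candidate is a minimal uninteresting itemset.
                negative_border.add(candidate)
        theory |= accepted

        # Next level: itemsets one item larger, all of whose immediate
        # subsets were accepted at the current level.
        level += 1
        candidates = set()
        for a in accepted:
            for b in accepted:
                union = a | b
                if len(union) == level and all(
                    frozenset(sub) in accepted
                    for sub in combinations(union, level - 1)
                ):
                    candidates.add(union)

    return theory, negative_border, evaluations


# Toy data set r with four rows over the items A, B, C.
rows = [frozenset("ABC"), frozenset("AB"), frozenset("AC"), frozenset("B")]
theory, negative_border, evaluations = levelwise(rows, "ABC", min_support=2)
```

On this toy data the interesting itemsets are {A}, {B}, {C}, {A, B} and {A, C}; the only rejected candidate is {B, C}; and {A, B, C} is never tested because one of its subsets was rejected. In the sketch every evaluated candidate ends up either in the theory or among the rejected border candidates, so the evaluation count is the sum of the two set sizes.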