A survey of interestingness measures for knowledge discovery

  • Authors:
  • Ken McGarry

  • Affiliations:
  • School of Computing and Technology, Informatics Building, University of Sunderland, St Peters Campus, St Peters Way, Sunderland SR6 0DD, UK. E-mail: ken.mcgarry@sunderland.ac.uk

  • Venue:
  • The Knowledge Engineering Review
  • Year:
  • 2005

Abstract

It is well known that the data mining process can generate many hundreds and often thousands of patterns from data. The task for the data miner then becomes one of distinguishing the most useful patterns from those that are trivial or already well known to the organization. It is therefore necessary to filter out the latter through the use of some measure of the patterns' actual worth. This article presents a review of the available literature on the various measures devised for evaluating and ranking the discovered patterns produced by the data mining process. These so-called interestingness measures are generally divided into two categories: objective measures, based on the statistical strengths or properties of the discovered patterns, and subjective measures, derived from the user's beliefs or expectations of their particular problem domain. We evaluate the strengths and weaknesses of the various interestingness measures with respect to the level of user integration within the discovery process.