Applying Objective Interestingness Measures in Data Mining Systems

  • Authors:
  • Robert J. Hilderman;Howard J. Hamilton

  • Affiliations:
  • -;-

  • Venue:
  • PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

One of the most important steps in any knowledge discovery task is the interpretation and evaluation of discovered patterns. To address this problem, various techniques, such as the chi-square test for independence, have been suggested to reduce the number of patterns presented to the user and to focus attention on those that are truly statistically significant. However, when mining a large database, the number of patterns discovered can remain large even after adjusting significance thresholds to eliminate spurious patterns. What is needed, then, is an effective measure to further assist in the interpretation and evaluation step that ranks the interestingness of the remaining patterns prior to presenting them to the user. In this paper, we describe a two-step process for ranking the interestingness of discovered patterns that utilizes the chi-square test for independence in the first step and objective measures of interestingness in the second step. We show how this two-step process can be applied to ranking characterized/generalized association rules and data cubes.