Selecting the right objective measure for association analysis

  • Authors:
  • Pang-Ning Tan; Vipin Kumar; Jaideep Srivastava

  • Affiliations:
  • Department of Computer Science, University of Minnesota, 200 Union Street SE, Minneapolis, MN (all three authors)

  • Venue:
  • Information Systems - Knowledge discovery and data mining (KDD 2002)
  • Year:
  • 2004

Abstract

Objective measures such as support, confidence, interest factor, correlation, and entropy are often used to evaluate the interestingness of association patterns. However, in many situations, these measures may provide conflicting information about the interestingness of a pattern. Data mining practitioners also tend to apply an objective measure without realizing that there may be better alternatives available for their application. In this paper, we describe several key properties one should examine in order to select the right measure for a given application. A comparative study of these properties is made using twenty-one measures that were originally developed in diverse fields such as statistics, social science, machine learning, and data mining. We show that, depending on its properties, each measure is useful for some applications but not for others. We also demonstrate two scenarios in which many existing measures become consistent with each other, namely, when support-based pruning and a technique known as table standardization are applied. Finally, we present an algorithm for selecting a small set of patterns such that domain experts can find a measure that best fits their requirements by ranking this small set of patterns.
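
The conflict between measures mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of support, confidence, interest factor, and the phi correlation coefficient on a 2x2 contingency table. The function name `measures`, the count names `f11`, `f10`, `f01`, `f00`, and the two example tables are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def measures(f11, f10, f01, f00):
    """Return four objective measures for the rule A -> B.

    f11: count of transactions with both A and B
    f10: A present, B absent
    f01: A absent, B present
    f00: neither A nor B
    Assumes every row and column total is non-zero.
    """
    n = f11 + f10 + f01 + f00
    f1p, f0p = f11 + f10, f01 + f00   # row totals: A present / A absent
    fp1, fp0 = f11 + f01, f10 + f00   # column totals: B present / B absent
    return {
        "support": f11 / n,
        "confidence": f11 / f1p,
        "interest": n * f11 / (f1p * fp1),
        "phi": (n * f11 - f1p * fp1) / math.sqrt(f1p * fp1 * f0p * fp0),
    }

# Two hypothetical tables with the same margins swapped along the diagonal.
table_x = measures(880, 50, 50, 20)
table_y = measures(20, 50, 50, 880)

for name in table_x:
    print(f"{name:>10}: X={table_x[name]:.3f}  Y={table_y[name]:.3f}")
```

On these hypothetical counts, confidence ranks the first table higher, interest factor ranks the second table higher, and phi assigns both the same value, so the ranking of the two patterns depends on which measure is chosen.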