Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution

  • Authors: Hui Xiong, Pang-Ning Tan, Vipin Kumar

  • Venue: ICDM '03: Proceedings of the Third IEEE International Conference on Data Mining
  • Year: 2003

Abstract

Existing association-rule mining algorithms often rely on the support-based pruning strategy to prune their combinatorial search space. This strategy is not very effective for data sets with skewed support distributions, because these algorithms tend either to generate many spurious patterns involving items from different support levels or to miss potentially interesting low-support patterns. To overcome these problems, we propose the concept of the hyperclique pattern, which uses an objective measure called h-confidence to identify strong affinity patterns. We also introduce the novel concept of the cross-support property for eliminating patterns involving items with substantially different support levels. Our experimental results demonstrate the effectiveness of this method for finding patterns in dense data sets even at very low support thresholds, where most of the existing algorithms would break down. Finally, hyperclique patterns also show great promise for clustering items in high-dimensional space.
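
The abstract names h-confidence and the cross-support property without defining them. As a rough illustration only, the Python sketch below assumes the definition commonly given for hyperclique patterns: the h-confidence of an itemset is its support divided by the largest support of any single item it contains, and the ratio of the smallest to the largest item support is an upper bound on that value. The function names, the toy transactions, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Set


def support(itemset: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def h_confidence(itemset: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """Assumed definition: supp(itemset) / max support of any single item in it."""
    max_item_support = max(support(frozenset([i]), transactions) for i in itemset)
    if max_item_support == 0:
        return 0.0
    return support(itemset, transactions) / max_item_support


def support_ratio(itemset: FrozenSet[str], transactions: List[Set[str]]) -> float:
    """min item support / max item support; an upper bound on h-confidence."""
    supports = [support(frozenset([i]), transactions) for i in itemset]
    return min(supports) / max(supports) if max(supports) > 0 else 0.0


# Toy data (hypothetical): "milk" is very frequent, "caviar" is rare.
transactions = [
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk", "bread", "caviar"},
    {"milk"},
    {"milk", "bread"},
]

threshold = 0.5  # illustrative h-confidence threshold, not from the paper
candidates = [frozenset({"milk", "bread"}), frozenset({"milk", "caviar"})]
for itemset in candidates:
    hc = h_confidence(itemset, transactions)
    ratio = support_ratio(itemset, transactions)
    kept = hc >= threshold
    print(sorted(itemset), f"h-conf={hc:.2f}", f"support ratio={ratio:.2f}", f"kept={kept}")
```

Under these assumptions, {milk, caviar} mixes items from very different support levels, so its support ratio already falls below the threshold and it could be discarded without counting the support of the pair itself, while {milk, bread} is retained as a strong affinity pattern.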