One in a million: picking the right patterns

  • Authors:
  • Björn Bringmann;Albrecht Zimmermann

  • Affiliations:
  • Katholieke Universiteit Leuven, Departement Computerwetenschappen, Celestijnenlaan 200a, 3001, Heverlee, Belgium;Katholieke Universiteit Leuven, Departement Computerwetenschappen, Celestijnenlaan 200a, 3001, Heverlee, Belgium

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2009

Abstract

Constrained pattern mining extracts patterns based on their individual merit. This usually yields far more patterns than a human expert or a machine learning technique could make use of. Different patterns, or combinations of patterns, often cover a similar subset of the examples, which makes them redundant: they carry no new information. To remove the redundant information contained in such pattern sets, we propose two general heuristic algorithms, Bouncer and Picker, for selecting a small subset of patterns. We identify several selection techniques for use in these general algorithms and evaluate them on several data sets. The results show that both techniques succeed in severely reducing the number of patterns while apparently retaining much of the original information. Additionally, the experiments show that reducing the pattern set indeed improves the quality of classification results. Both results show that the developed solutions are well suited to the goals we aim at.
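
To make the notion of redundancy concrete, the following is a minimal sketch of a generic single-pass filter that drops a pattern when the examples it covers overlap too strongly with those of an already selected pattern. It is not the paper's Bouncer or Picker algorithm, whose details the abstract does not give. The function names, the Jaccard similarity measure, the `max_overlap` threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard similarity of two cover sets (1.0 means identical covers)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def select_patterns(
    covers: Dict[str, FrozenSet[int]], max_overlap: float = 0.5
) -> List[str]:
    """Hypothetical filter: keep a pattern only if its cover is not too
    similar to the cover of any pattern that was kept before it."""
    selected: List[str] = []
    for name, cover in covers.items():  # patterns are visited in input order
        if all(jaccard(cover, covers[s]) <= max_overlap for s in selected):
            selected.append(name)
    return selected


# Toy data (assumed): each pattern maps to the ids of the examples it covers.
covers = {
    "p1": frozenset({1, 2, 3, 4}),
    "p2": frozenset({1, 2, 3}),   # covers almost the same examples as p1
    "p3": frozenset({5, 6}),
    "p4": frozenset({5, 6, 7}),   # covers almost the same examples as p3
    "p5": frozenset({4, 8}),
}

print(select_patterns(covers))  # ['p1', 'p3', 'p5']
```

In this sketch the result depends on the order in which patterns are visited and on the chosen threshold, which is one reason such selection procedures are heuristic rather than exact.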