The Panda framework for comparing patterns

  • Authors:
  • Ilaria Bartolini; Paolo Ciaccia; Irene Ntoutsi; Marco Patella; Yannis Theodoridis

  • Affiliations:
  • DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy; DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy; Department of Informatics, University of Piraeus, Greece and Research Academic Computer Technology Institute, Athens, Greece; DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy; Department of Informatics, University of Piraeus, Greece and Research Academic Computer Technology Institute, Athens, Greece

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2009

Abstract

Data mining techniques are commonly used to extract patterns, such as association rules and decision trees, from huge volumes of data. The comparison of patterns is a fundamental issue, which can be exploited, among other things, to concisely measure dissimilarities between evolving or different datasets and to compare the output produced by different data mining algorithms on the same dataset. In this paper, we present the Panda framework for computing the dissimilarity of both simple and complex patterns, defined upon raw data and other patterns, respectively. In Panda, the problem of comparing complex patterns is decomposed into simpler sub-problems on the component (simple or complex) patterns, and the resulting partial solutions are then aggregated into an overall dissimilarity score. This intrinsically recursive approach gives Panda high flexibility and allows it to easily handle patterns with highly complex structures. Panda is built upon a few basic concepts so as to be generic and clear to the end user. We demonstrate the generality and flexibility of Panda by showing how it can be easily applied to a variety of pattern types, including sets of itemsets and clusterings.
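
The recursive decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the Panda implementation: the abstract does not specify how components are matched or how partial scores are aggregated, so everything below is an assumption made for illustration. Simple patterns are modelled as frozensets of raw items and compared with Jaccard distance; complex patterns are frozensets of other patterns, compared by pairing each component with its closest counterpart and averaging the resulting scores. The names jaccard_distance, is_simple, and dissimilarity are hypothetical.

```python
def jaccard_distance(a, b):
    """Dissimilarity of two simple patterns, modelled here as sets of raw items."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def is_simple(pattern):
    """Assumption: a pattern is simple when none of its members is itself a pattern."""
    return all(not isinstance(member, frozenset) for member in pattern)


def dissimilarity(p, q):
    """Recursively compare two patterns and return a score in [0, 1].

    Simple patterns are compared directly. Complex patterns are decomposed
    into their components, each component is paired with its closest
    counterpart on the other side (an assumed matching rule), and the
    partial scores are averaged (an assumed aggregation rule).
    """
    if not p and not q:
        return 0.0
    if not p or not q:
        return 1.0
    if is_simple(p) != is_simple(q):
        raise ValueError("patterns must have the same nesting level")
    if is_simple(p):
        return jaccard_distance(p, q)
    # Sub-problems: best-match score for every component, in both directions.
    p_to_q = [min(dissimilarity(x, y) for y in q) for x in p]
    q_to_p = [min(dissimilarity(y, x) for x in p) for y in q]
    # Aggregation: average of all partial scores.
    return (sum(p_to_q) + sum(q_to_p)) / (len(p_to_q) + len(q_to_p))


# Two sets of itemsets, e.g. mined from two versions of a dataset.
old = frozenset({frozenset({"a", "b"}), frozenset({"c"})})
new = frozenset({frozenset({"a", "b"}), frozenset({"c", "d"})})
score = dissimilarity(old, new)
```

Because the function calls itself on components, the same code also handles deeper nesting (for example, a set of sets of itemsets) without modification, which is the kind of flexibility the abstract attributes to the recursive approach. Other matching and aggregation rules could be swapped in without changing the overall structure.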