The Panda framework for comparing patterns

Authors:
Ilaria Bartolini;Paolo Ciaccia;Irene Ntoutsi;Marco Patella;Yannis Theodoridis
Affiliations:
DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy;DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy;Department of Informatics, University of Piraeus, Greece and Research Academic Computer Technology Institute, Athens, Greece;DEIS, University of Bologna, viale Risorgimento, 2, 40136 Bologna, Italy;Department of Informatics, University of Piraeus, Greece and Research Academic Computer Technology Institute, Athens, Greece
Venue:
Data & Knowledge Engineering
Year:
2009

Abstract

Data mining techniques are commonly used to extract patterns, such as association rules and decision trees, from huge volumes of data. Comparing patterns is a fundamental problem: it can be used, among other things, to concisely measure dissimilarities between evolving or different datasets and to compare the output produced by different data mining algorithms on the same dataset. In this paper, we present the Panda framework for computing the dissimilarity of both simple and complex patterns, defined upon raw data and upon other patterns, respectively. In Panda, the problem of comparing complex patterns is decomposed into simpler sub-problems on the component (simple or complex) patterns, and the partial solutions so obtained are then aggregated into an overall dissimilarity score. This intrinsically recursive approach gives Panda high flexibility and allows it to handle patterns with highly complex structures. Panda is built upon a few basic concepts so as to be generic and clear to the end user. We demonstrate the generality and flexibility of Panda by showing how it can be applied to a variety of pattern types, including sets of itemsets and clusterings.
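
The recursive decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class names (`Itemset`, `PatternSet`), the Jaccard-plus-support distance for simple patterns, and the averaged best-match aggregation are all assumptions chosen for illustration; the abstract does not specify which base distances or aggregation functions Panda uses.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple, Union


@dataclass(frozen=True)
class Itemset:
    """Hypothetical simple pattern: a set of items plus a support in [0, 1]."""
    items: FrozenSet[str]
    support: float


@dataclass(frozen=True)
class PatternSet:
    """Hypothetical complex pattern: a collection of component patterns."""
    components: Tuple["Pattern", ...]


Pattern = Union[Itemset, PatternSet]


def simple_dissimilarity(p: Itemset, q: Itemset) -> float:
    """Base case (assumed): mean of Jaccard distance and support difference."""
    union = p.items | q.items
    structure = 1.0 - len(p.items & q.items) / len(union) if union else 0.0
    measure = abs(p.support - q.support)
    return (structure + measure) / 2.0


def aggregate(ps: Sequence[Pattern], qs: Sequence[Pattern]) -> float:
    """Assumed aggregation: match each component to its closest counterpart
    on the other side, average the matched scores, and do so in both
    directions so that the result is symmetric."""
    if not ps and not qs:
        return 0.0
    if not ps or not qs:
        return 1.0

    def one_way(xs: Sequence[Pattern], ys: Sequence[Pattern]) -> float:
        return sum(min(dissimilarity(x, y) for y in ys) for x in xs) / len(xs)

    return (one_way(ps, qs) + one_way(qs, ps)) / 2.0


def dissimilarity(p: Pattern, q: Pattern) -> float:
    """Recursive step: simple patterns are compared directly; complex
    patterns are compared through their components."""
    if isinstance(p, Itemset) and isinstance(q, Itemset):
        return simple_dissimilarity(p, q)
    if isinstance(p, PatternSet) and isinstance(q, PatternSet):
        return aggregate(p.components, q.components)
    raise TypeError("both patterns must be of the same kind")
```

Because `aggregate` calls `dissimilarity` on the components, the same function handles a set of itemsets, a set of sets of itemsets, and deeper nestings without change. Swapping in a different base distance or matching strategy only requires replacing `simple_dissimilarity` or `aggregate`.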