Exploratory data analysis leading towards the most interesting simple association rules

Authors:
Alfonso Iodice D'Enza;Francesco Palumbo;Michael Greenacre
Affiliations:
Dipartimento di Matematica e Statistica, Università di Napoli Federico II, Italy;Dipartimento di Istituzioni Economiche e Finanziarie, Università di Macerata, Via Crescimbeni, 20 I-62100 Macerata, Italy;Department of Economics and Business, Universitat Pompeu Fabra, Barcelona, Spain
Venue:
Computational Statistics & Data Analysis
Year:
2008

Abstract

Association rules (AR) are one of the most powerful and widely used approaches for detecting regularities and paths in large databases. Rules express relations, in terms of co-occurrence, between pairs of items and are characterized by two measures: support and confidence. Most techniques for finding AR scan the whole data set, evaluate all possible rules, and retain only those whose support and confidence exceed given thresholds. These thresholds must be set carefully, so that the retained rules are not all trivial and interesting rules are not discarded. A multistep approach aims at identifying potentially interesting items by exploiting well-known techniques of multidimensional data analysis. In particular, interesting pairs of items have a well-defined degree of association: an item pair is well defined if its degree of co-occurrence is very high with respect to one or more subsets of the considered set of transactions.
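
As an illustration of the support and confidence thresholding described in the abstract, the following minimal sketch enumerates simple rules between pairs of items. It is not the authors' multistep method. The function name `pairwise_rules`, the toy transactions, and the threshold values are assumptions chosen for the example. Support is taken here as the fraction of transactions containing both items, and confidence as that joint count divided by the count of the antecedent item.

```python
from itertools import permutations


def pairwise_rules(transactions, min_support, min_confidence):
    """Return simple rules a -> b whose support and confidence exceed the thresholds.

    Hypothetical helper for illustration only: it scans the whole data set
    and evaluates every ordered pair of co-occurring items.
    """
    n = len(transactions)
    item_counts = {}
    pair_counts = {}
    for transaction in transactions:
        items = set(transaction)
        for a in items:
            item_counts[a] = item_counts.get(a, 0) + 1
        # Ordered pairs, because a -> b and b -> a have different confidence.
        for a, b in permutations(sorted(items), 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1

    rules = []
    for (a, b), joint in pair_counts.items():
        support = joint / n                 # share of transactions with both a and b
        confidence = joint / item_counts[a]  # share of a's transactions that also hold b
        if support > min_support and confidence > min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Toy data (assumed, not from the paper).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rules = pairwise_rules(transactions, min_support=0.4, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Raising either threshold shrinks the rule set, and lowering them lets more (possibly trivial) rules through. This is the trade-off in choosing thresholds that the abstract points to.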