Is pushing constraints deeply into the mining algorithms really what we want?: an alternative approach for association rule mining

Authors:
Jochen Hipp;Ulrich Güntzer
Affiliations:
DaimlerChrysler AG, Research & Technology, Ulm, Germany;University of Tübingen, Germany
Venue:
ACM SIGKDD Explorations Newsletter
Year:
2002

Abstract

The common approach to exploiting mining constraints is to push them deeply into the mining algorithms. In this paper we argue that this approach rests on an understanding of KDD that is no longer up to date. Today, KDD is seen as a human-centered, highly interactive, and iterative process. Blindly enforcing constraints during the mining runs neglects this process character and is therefore no longer state of the art. Constraints can make a single algorithm run faster, but we are still far from response times that would allow true interactivity in KDD. In addition, we pay the price of repeated mining runs and risk reducing data mining to a kind of hypothesis testing. Taking all of this into consideration, we propose the exact opposite of constrained mining: we accept an initial, (nearly) unconstrained and costly mining run, but instead of following it with a sequence of still-expensive constrained mining runs, we answer all further mining queries from this initial result set. Whereas this is straightforward for constraints that can be implemented as filters on the result set, things get more complicated when we restrict the underlying mining data. In practice such constraints are very important, e.g. the generation of rules for certain days of the week, or for families, singles, male or female customers, etc. We show how to postpone such row-restriction constraints on the transactions from rule generation to rule retrieval from the initial result set.
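The idea of answering a row-restricted query from a stored result set can be illustrated with a small Python sketch. It is not taken from the paper: it assumes that row attributes such as the day of the week are encoded as extra items in each transaction (here the hypothetical item names `day=Sat` and `day=Mon`), and that the initial run keeps the count of every itemset. The function names `mine_itemset_counts` and `rule_stats` and the toy data are likewise illustrative. Under those assumptions, the support and confidence of a rule on the restricted rows follow from stored counts alone, without a second pass over the data.

```python
from collections import Counter
from itertools import combinations


def mine_itemset_counts(transactions):
    """Brute-force stand-in for the initial, unconstrained mining run.

    Counts every itemset that occurs in at least one transaction.
    Only suitable for toy data: the number of itemsets grows exponentially.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return counts


def rule_stats(counts, n_rows, antecedent, consequent, restrict=()):
    """Return (support, confidence) of antecedent -> consequent.

    `restrict` is a set of attribute items (an assumption of this sketch);
    the rule is evaluated only on rows containing all of them, using
    nothing but the stored counts. Returns None if no row qualifies.
    """
    restrict = frozenset(restrict)
    body = frozenset(antecedent) | restrict
    whole = body | frozenset(consequent)
    # Number of rows that satisfy the row restriction.
    base = counts.get(restrict, 0) if restrict else n_rows
    if base == 0 or counts.get(body, 0) == 0:
        return None
    hits = counts.get(whole, 0)
    return hits / base, hits / counts[body]


# Toy transactions; the "day=..." items are the encoded row attributes.
transactions = [
    {"day=Sat", "beer", "chips"},
    {"day=Sat", "beer", "chips", "wine"},
    {"day=Sat", "beer"},
    {"day=Mon", "beer", "milk"},
    {"day=Mon", "milk", "chips"},
    {"day=Mon", "milk"},
]

counts = mine_itemset_counts(transactions)  # the one costly run
n_rows = len(transactions)

overall = rule_stats(counts, n_rows, {"beer"}, {"chips"})
saturday = rule_stats(counts, n_rows, {"beer"}, {"chips"}, restrict={"day=Sat"})
print("all rows:", overall)
print("Saturdays only:", saturday)
```

A constraint that is a plain filter (for example, "only rules with chips in the consequent") needs no such encoding and can be applied directly to the stored rules. In a realistic setting the initial run would use a minimum support threshold, so some counts would be missing rather than zero; the sketch sidesteps that by counting every itemset.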