Efficient Rule Retrieval and Postponed Restrict Operations for Association Rule Mining

  • Authors:
  • Jochen Hipp;Christoph Mangold;Ulrich Güntzer;Gholamreza Nakhaeizadeh

  • Venue:
  • PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2002

Abstract

Knowledge discovery in databases is a complex, iterative, and highly interactive process. When mining for association rules, interactivity is typically smothered by the execution times of the rule generation algorithms. Our approach is to accept a single, possibly expensive run, but all subsequent mining queries are supposed to be answered interactively by accessing a sophisticated rule cache. However, there are two critical aspects. First, access to the cache must be efficient and comfortable. Therefore, we enrich the basic association mining framework by descriptions of items through application-dependent attributes. Furthermore, we extend current mining query languages to deal with these attributes through ∃ and ∀ quantifiers. Second, the cache must be prepared to answer a broad variety of queries without rerunning the mining algorithm. A main contribution of this paper is that we show how to postpone restrict operations on the transactions from rule generation to rule retrieval from the cache. That is, without actually rerunning the algorithm, we efficiently construct those rules from the cache that would have been generated if the mining algorithm were run on only a subset of the transactions. In addition, we describe how we implemented our ideas on a conventional relational database system. We evaluate our prototype concerning response times in a pilot application at DaimlerChrysler. It turns out to easily satisfy the demands of interactive data mining.
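
The idea of postponing a restrict operation can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the simplest kind of restriction, "only transactions that contain a given set of items", and it assumes the cache is an in-memory dictionary of itemset counts rather than tables in a relational database. The names mine_itemset_counts and restricted_rule are hypothetical. Under these assumptions, the number of restricted transactions containing an itemset X equals the cached count of X together with the restricting items, so support and confidence of a rule on the subset can be read from the cache without another mining run.

```python
from itertools import combinations


def mine_itemset_counts(transactions, min_count=1):
    """Single brute-force mining run (illustrative only, exponential in
    transaction length): count every itemset, including the empty set,
    and keep those occurring in at least min_count transactions."""
    counts = {}
    for t in transactions:
        items = sorted(set(t))
        for k in range(0, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_count}


def restricted_rule(cache, body, head, restrict=()):
    """Support and confidence of body -> head among the transactions that
    contain every item in `restrict`, computed from cached counts only.
    Returns None if a needed itemset is missing from the cache."""
    body, head, restrict = frozenset(body), frozenset(head), frozenset(restrict)
    base = cache.get(restrict)                 # size of the restricted subset
    body_r = cache.get(body | restrict)        # subset transactions with body
    rule_r = cache.get(body | head | restrict) # subset transactions with body and head
    if base is None or body_r is None or rule_r is None:
        return None
    return rule_r / base, rule_r / body_r


# Hypothetical toy data: one mining run fills the cache, later queries reuse it.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
cache = mine_itemset_counts(transactions, min_count=1)
unrestricted = restricted_rule(cache, {"a"}, {"b"})
only_with_c = restricted_rule(cache, {"a"}, {"b"}, restrict={"c"})
```

One caveat of this sketch: if the cache was built with a count threshold above 1, itemsets that are frequent only within the restricted subset may be missing, in which case restricted_rule returns None rather than a value.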