Implicit enumeration of patterns

Authors:
Taneli Mielikäinen
Affiliations:
HIIT Basic Research Unit, Department of Computer Science, University of Helsinki, Finland
Venue:
KDID'04 Proceedings of the Third international conference on Knowledge Discovery in Inductive Databases
Year:
2004

Abstract

Condensed representations of pattern collections are recognized as important building blocks of inductive databases, a promising theoretical framework for data mining, and they have recently been studied actively. However, there has been little research on how condensed representations should actually be represented. In this paper we study implicit enumeration of patterns, i.e., how to represent pattern collections by listing only the interestingness values of the patterns. The main problem is that pattern classes are typically huge compared to the collections of interesting patterns in them. We solve this problem by choosing an order in which to list the patterns of the class such that the order admits effective pruning and prediction of the interestingness values. This representation of interestingness values enables us to quantify how surprising a pattern is in the collection. Furthermore, the encoding of the interestingness values reflects our understanding of the pattern collection. Thus the size of the encoding can be used to evaluate the correctness of our assumptions about the pattern collection and the interestingness measure.
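To make the idea of implicit enumeration concrete, here is a minimal Python sketch, not the paper's algorithm. It assumes that patterns are itemsets, that the interestingness measure is support with a threshold `min_support` of at least 1, and that the listing order is level-wise (by size, then lexicographic). The function names `listed_candidates`, `encode`, and `decode` are hypothetical. Only values are written, never the itemsets themselves. An itemset with an infrequent subset is pruned, so nothing is written for it, and the decoder recovers the collection by replaying the same order. The prediction of values mentioned in the abstract is left out of this sketch.

```python
from itertools import combinations


def listed_candidates(items, size, frequent):
    """Yield, in lexicographic order, the itemsets of a given size whose
    value must be listed: those whose (size-1)-subsets are all frequent."""
    for combo in combinations(sorted(items), size):
        if size == 1 or all(
            frozenset(sub) in frequent for sub in combinations(combo, size - 1)
        ):
            yield frozenset(combo)


def encode(transactions, items, min_support):
    """Return only the list of values; 0 marks a listed but infrequent itemset.
    Assumes min_support >= 1 so that the marker 0 is unambiguous."""
    values, frequent = [], {}
    for size in range(1, len(items) + 1):
        level = {}
        for cand in listed_candidates(items, size, frequent):
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_support:
                values.append(count)
                level[cand] = count
            else:
                values.append(0)
        if not level:  # no frequent itemset at this size: larger ones are pruned
            break
        frequent.update(level)
    return values


def decode(values, items, min_support):
    """Rebuild {itemset: support} by replaying the same listing order."""
    stream, frequent = iter(values), {}
    for size in range(1, len(items) + 1):
        level = {}
        for cand in listed_candidates(items, size, frequent):
            count = next(stream)
            if count >= min_support:
                level[cand] = count
        if not level:
            break
        frequent.update(level)
    return frequent


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
items = ["a", "b", "c"]
values = encode(transactions, items, min_support=2)
patterns = decode(values, items, min_support=2)
```

In this toy example six values are written for the seven non-empty itemsets, because `{a, b, c}` is pruned once `{b, c}` is known to be infrequent. A listing order that prunes more, or a predictor that makes the listed values cheaper to code, would shorten the encoding, which is the sense in which the encoding size reflects how well the assumptions fit the collection.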