Frequent pattern discovery without binarization: mining attribute profiles

Authors:
Attila Gyenesei;Ralph Schlapbach;Etzard Stolte;Ulrich Wagner
Affiliations:
Knowledge and Data Analysis, Unilever Food and Health Research Institute, Vlaardingen, AC, The Netherlands;Functional Genomics Center Zürich, Uni ETH Zürich, Zürich, Switzerland;Knowledge and Data Analysis, Unilever Food and Health Research Institute, Vlaardingen, AC, The Netherlands;Functional Genomics Center Zürich, Uni ETH Zürich, Zürich, Switzerland
Venue:
PKDD'06 Proceedings of the 10th European conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006

Citing 11
Cited 0

Mining quantitative association rules in large relational tables
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data

Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data

Efficiently mining long patterns from databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data

Algorithms for association rule mining — a general survey and comparison
ACM SIGKDD Explorations Newsletter

Levelwise Search and Borders of Theories in Knowledge Discovery
Data Mining and Knowledge Discovery

Scalable Algorithms for Association Mining
IEEE Transactions on Knowledge and Data Engineering

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Mining gene expression data for positive and negative co-regulated gene clusters
Bioinformatics

Analyzing microarray data using quantitative association rules
Bioinformatics

Towards ad-hoc rule semantics for gene expression data
ISMIS'05 Proceedings of the 15th international conference on Foundations of Intelligent Systems

Abstract

Frequent pattern discovery has become a popular solution to many scientific and industrial problems across a wide range of datasets. Traditional algorithms, developed for binary (Boolean) attributes, can be applied to such data only after non-binary (continuous or categorical) attribute domains have been transformed into binary ones. As a consequence of this binarization, the discovered patterns no longer reflect associations between attributes but relations between their binned, independent values, and thus interactions between the original attributes may be lost. In this paper we propose to overcome this limitation by introducing the concept of mining frequent attribute profiles, which describe the relationships between the original attributes. With this concept, previously hidden interactions can be discovered, and redundant patterns identified by traditional methods are eliminated. A novel algorithm, called MAP, has been developed for mining attribute profiles; it can potentially be applied to diverse data domains. The effectiveness of the proposed method is demonstrated on gene expression (microarray) data.
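
To make the binarization issue concrete, the following is a minimal Python sketch of the traditional preprocessing step the abstract describes. It is not the MAP algorithm, whose details are not given here. The attribute names (`g1`, `g2`), the two bins, the toy values, and the helper functions `binarize` and `frequent_pairs` are all assumptions made for illustration.

```python
from itertools import combinations


def binarize(rows, bins):
    """Turn each numeric row into a set of Boolean items such as 'g1=high'.

    `bins` maps a label to a half-open interval [lo, hi). Both the labels
    and the cut points are hypothetical choices for this sketch.
    """
    transactions = []
    for row in rows:
        items = set()
        for attr, value in row.items():
            for label, (lo, hi) in bins.items():
                if lo <= value < hi:
                    items.add(f"{attr}={label}")
        transactions.append(items)
    return transactions


def frequent_pairs(transactions, min_support):
    """Return item pairs whose relative support is at least min_support."""
    counts = {}
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] = counts.get(pair, 0) + 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Toy data (invented): g1 and g2 rise and fall together in every row.
rows = [
    {"g1": 0.9, "g2": 0.8},
    {"g1": 0.1, "g2": 0.2},
    {"g1": 0.8, "g2": 0.9},
    {"g1": 0.2, "g2": 0.1},
]
bins = {"low": (0.0, 0.5), "high": (0.5, 1.01)}

transactions = binarize(rows, bins)
pairs = frequent_pairs(transactions, min_support=0.5)
print(pairs)
```

In this toy case the single relationship between `g1` and `g2` surfaces as two separate item-level patterns, one over the `high` bins and one over the `low` bins, each with support 0.5. This is the kind of value-level, partly redundant output that the abstract contrasts with patterns over the original attributes.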