Mining association rules in very large clustered domains

Authors:
Alexandros Nanopoulos;Apostolos N. Papadopoulos;Yannis Manolopoulos
Affiliations:
Department of Informatics, Aristotle University, 54124, Thessaloniki, Greece;Department of Informatics, Aristotle University, 54124, Thessaloniki, Greece;Department of Informatics, Aristotle University, 54124, Thessaloniki, Greece
Venue:
Information Systems
Year:
2007


Abstract

Emerging applications require association-rule mining algorithms that scale not only with the number of records (rows) but also with the size of the domain (columns). In this paper, we focus on cases where the items of a large domain correlate with each other so that small worlds are formed; that is, the domain is clustered into groups with many intra-group correlations and few inter-group correlations. This property appears in several real-world settings, e.g., bioinformatics, e-commerce applications, and bibliographic analysis, and it can be exploited to prune the search space significantly and thus make association-rule mining efficient. We develop an algorithm that partitions the domain of items according to their correlations, and we describe a mining algorithm that carefully combines partitions to improve efficiency. Our experiments show that the proposed method outperforms existing algorithms and that it avoids the problems those algorithms face (e.g., increased CPU cost and possible I/O thrashing) when a large domain is combined with a large number of records.
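The abstract does not give the details of the partitioning step, so the following is only a minimal sketch of the general idea of splitting a clustered item domain into groups, not the authors' algorithm. It assumes that two items are "correlated" when they co-occur in at least `min_support` transactions, and it takes the connected components of the resulting co-occurrence graph as the partitions. The function name `partition_items`, the threshold, and the toy transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def partition_items(transactions, min_support):
    """Illustrative only: group items into the connected components of the
    graph whose edges are item pairs that co-occur in at least
    min_support transactions (an assumed notion of correlation)."""
    pair_counts = Counter()
    items = set()
    for t in transactions:
        unique = sorted(set(t))          # ignore duplicates within a transaction
        items.update(unique)
        pair_counts.update(combinations(unique, 2))

    # Union-find over items; each item starts in its own group.
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the groups of every sufficiently frequent pair.
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    clusters = {}
    for item in items:
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy data: two groups of items linked by one weak co-occurrence.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["x", "y"],
    ["x", "y", "a"],
    ["y", "x"],
]
clusters = partition_items(transactions, min_support=2)
print(clusters)
```

With a threshold of 2, the single transaction that links `a` to `x` and `y` is too weak to merge the two groups, so each group could be mined separately over a much smaller set of columns. How partitions are then combined to recover rules that span groups is the part the paper addresses and is not shown here.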