Automatic discovery of locally frequent itemsets in the presence of highly frequent itemsets

Authors:
Ferenc Bodon;Ioannis N. Kouris;Christos H. Makris;Athanasios K. Tsakalidis
Affiliations:
Informatics Laboratory, Computer and Automation Research Institute, Hungarian Academy of Sciences and Department of Computer Science and Information Theory, Budapest University of Technology and E ...;Department of Computer Engineering and Informatics, University of Patras, School of Engineering, 26500 Patras, Hellas, Greece and Computer Technology Institute, P.O. BOX 1192, 26110 Patras, Hellas ...;Department of Computer Engineering and Informatics, University of Patras, School of Engineering, 26500 Patras, Hellas, Greece and Computer Technology Institute, P.O. BOX 1192, 26110 Patras, Hellas ...;Department of Computer Engineering and Informatics, University of Patras, School of Engineering, 26500 Patras, Hellas, Greece and Computer Technology Institute, P.O. BOX 1192, 26110 Patras, Hellas ...
Venue:
Intelligent Data Analysis
Year:
2005

Citing 18
Cited 2

Introduction to object-oriented databases

Introduction to object-oriented databases
Mining association rules between sets of items in large databases

SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
Efficient parallel data mining for association rules

CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
An effective hash-based algorithm for mining association rules

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Beyond market baskets: generalizing association rules to correlations

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A new framework for itemset generation

PODS '98 Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Using association rules for product assortment decisions: a case study

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining association rules with multiple minimum supports

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Managing gigabytes (2nd ed.): compressing and indexing documents and images

Managing gigabytes (2nd ed.): compressing and indexing documents and images
A fast distributed algorithm for mining association rules

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Parallel Mining of Association Rules

IEEE Transactions on Knowledge and Data Engineering
Mining Frequent Itemsets Using Support Constraints

VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Discovery of Multiple-Level Association Rules from Large Databases

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
An Efficient Algorithm for Mining Association Rules in Large Databases

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Mining Generalized Association Rules

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Sampling Large Databases for Association Rules

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases

Ethical aspects of web log data mining

International Journal of Information Technology and Management
Using ontologies to facilitate post-processing of association rules by domain experts

Information Sciences: an International Journal

Abstract

Many approaches have been proposed for mining association rules that involve rare but 'interesting' itemsets in datasets that also contain highly frequent itemsets. However, all of the approaches proposed so far assume that we already know which itemsets are interesting and what the right support value for each of them is; none offers a way to discover such itemsets automatically. In this work we introduce the notion of locally frequent itemsets and argue, based on the opinion of field experts, that they form the largest and most frequently occurring category of rare but interesting itemsets, especially in commercial applications. We then propose two algorithms for finding and handling these itemsets. The main idea is to divide the database into partitions according to the needs of the problem and, besides searching for itemsets that are frequent in the whole database, to search also for itemsets that are frequent when considered within these partitions. Our approach proves very effective, and also very efficient compared with traditional algorithms, on both synthetic and real data.
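
The partitioning idea in the abstract can be illustrated with a small sketch. This is not either of the two algorithms proposed in the paper, whose details the abstract does not give. It is a brute-force illustration that assumes the database has already been split into named partitions (here, hypothetically, by store) and that one support threshold is used for the whole database and another inside each partition. The function names, the thresholds and the toy data are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: relative support} for itemsets meeting min_support.

    Brute-force enumeration of all itemsets up to max_size items; only
    suitable for tiny examples.
    """
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def locally_frequent_itemsets(partitions, min_support, local_min_support,
                              max_size=2):
    """Find globally frequent itemsets and, per partition, the itemsets
    that are frequent inside that partition but not in the whole database.

    `partitions` maps a partition name to its list of transactions
    (an assumed layout, chosen for illustration).
    """
    all_transactions = [t for part in partitions.values() for t in part]
    global_frequent = frequent_itemsets(all_transactions, min_support, max_size)
    local_frequent = {}
    for name, part in partitions.items():
        found = frequent_itemsets(part, local_min_support, max_size)
        # Keep only itemsets that the global pass would have missed.
        local_frequent[name] = {
            s: v for s, v in found.items() if s not in global_frequent
        }
    return global_frequent, local_frequent


# Toy data: two hypothetical stores, four transactions each.
partitions = {
    "store_a": [["milk", "bread"], ["milk", "bread"], ["milk"], ["milk", "bread"]],
    "store_b": [["milk", "skis"], ["milk", "skis"], ["milk"], ["bread"]],
}
global_frequent, local_frequent = locally_frequent_itemsets(
    partitions, min_support=0.5, local_min_support=0.5
)
```

In this toy data, "skis" appears in only 2 of the 8 transactions overall, so it falls below the global threshold, yet it appears in half of the transactions of "store_b" and is therefore reported as locally frequent there.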