Clustering local frequency items in multiple databases

Authors:
Animesh Adhikari
Affiliations:
Department of Computer Science, SP Chowgule College, Margao, Goa 403 602, India
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

Frequent items can be regarded as a basic type of pattern in a database. In the context of multiple data sources, most global patterns are based on local frequency items. A multi-branch company that transacts from different branches often needs to extract global patterns from data distributed over those branches, and global decisions can be taken effectively using such patterns. It is therefore important to cluster local frequency items in multiple databases. We present an overview of existing measures of association, survey existing multi-database mining techniques in order to select a suitable technique for mining multiple databases, and review related clustering techniques. We introduce the notion of high frequency itemsets and design an algorithm for synthesizing the supports of such itemsets. The existing clustering technique might cluster local frequency items at a low level, since it estimates the association among items in an itemset with low accuracy; a new algorithm for clustering local frequency items is therefore proposed. Because the measure of association A2 is suitable for this purpose, the association among items in a high frequency itemset is synthesized based on it. The soundness of the clustering technique is shown. Numerous experiments are conducted on five datasets, and results on different aspects of the proposed problem are presented in the experimental section. The effectiveness of the proposed clustering technique is more visible in dense databases.
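
The abstract does not spell out how supports are synthesized, so the following is only a minimal sketch of one common way to combine local supports into a global estimate: a size-weighted average over the branch databases. The function name `synthesize_support`, its parameters, and the choice to count an itemset as having support 0 in any branch that does not report it are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]


def synthesize_support(
    local_supports: List[Dict[Itemset, float]],
    db_sizes: List[int],
    itemset: Itemset,
) -> float:
    """Estimate the global support of `itemset` across several databases.

    local_supports[i] maps each itemset reported by database i to its
    local support (a fraction in [0, 1]); db_sizes[i] is the number of
    transactions in database i.

    Assumption (not from the paper): an itemset that a database does not
    report is counted with support 0 there, so the result is a lower
    bound whenever some local supports are missing.
    """
    if len(local_supports) != len(db_sizes):
        raise ValueError("one size is required per database")
    total = sum(db_sizes)
    if total <= 0:
        raise ValueError("databases must contain at least one transaction")
    # Weight each local support by the size of the database it came from.
    weighted = sum(
        supports.get(itemset, 0.0) * size
        for supports, size in zip(local_supports, db_sizes)
    )
    return weighted / total


# Hypothetical example: two branches with 100 and 300 transactions.
ab = frozenset({"a", "b"})
branches = [{ab: 0.5}, {ab: 0.25}]
sizes = [100, 300]
global_support = synthesize_support(branches, sizes, ab)  # (50 + 75) / 400
```

With all local supports known, the weighted average equals the exact support in the union of the databases. The lower-bound behaviour for unreported itemsets is a simplification chosen here to keep the sketch short.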