This paper proposes a new similarity measure between basket datasets based on associations. The measure is computed from support counts using a formula inspired by information entropy, and experiments on both real and synthetic datasets show that it is effective. The paper then investigates two applications of the measure. First, it studies the problem of finding a mapping between categorical database attribute sets using similarity measures and proposes a generic approach for identifying such a mapping; the approach is implemented with the proposed measure, and its performance is evaluated and validated. Second, it explores the use of the measure for mining distributed datasets.
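
To make the idea of an entropy-inspired, support-based similarity concrete, the following is a minimal sketch. It is not the formula from the paper, which the abstract does not state. The function names (`support_counts`, `binary_entropy`, `similarity`), the `min_support` and `max_size` parameters, and the scoring rule itself are all assumptions chosen for illustration: each itemset that is frequent in either dataset contributes the binary entropy of its share of support, so an itemset with equal support in both datasets scores 1 and an itemset that appears in only one scores 0.

```python
from collections import Counter
from itertools import combinations
from math import log2


def support_counts(baskets, max_size=2):
    """Count how many baskets contain each itemset of up to max_size items."""
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # canonical order, duplicates removed
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def binary_entropy(p):
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def similarity(baskets_a, baskets_b, min_support=0.2, max_size=2):
    """Illustrative similarity in [0, 1]; an assumed formula, not the paper's."""
    n_a, n_b = len(baskets_a), len(baskets_b)
    if n_a == 0 or n_b == 0:
        return 0.0
    # Relative support of every itemset in each dataset.
    sup_a = {x: c / n_a for x, c in support_counts(baskets_a, max_size).items()}
    sup_b = {x: c / n_b for x, c in support_counts(baskets_b, max_size).items()}
    # Itemsets that are frequent in at least one of the two datasets.
    frequent = {x for x, s in sup_a.items() if s >= min_support}
    frequent |= {x for x, s in sup_b.items() if s >= min_support}
    if not frequent:
        return 0.0
    total = 0.0
    for itemset in frequent:
        a = sup_a.get(itemset, 0.0)
        b = sup_b.get(itemset, 0.0)
        # Share of the combined support that comes from dataset A.
        total += binary_entropy(a / (a + b))
    return total / len(frequent)


if __name__ == "__main__":
    store_1 = [["bread", "milk"], ["bread", "butter"], ["milk", "butter", "bread"]]
    store_2 = [["bread", "milk"], ["beer", "chips"], ["milk", "bread", "chips"]]
    print(similarity(store_1, store_2))
```

Under these assumptions, comparing a dataset with itself gives the maximum score and comparing two datasets with no items in common gives zero. A score of this kind could, in principle, be used to rank candidate attribute mappings or to group the sites of a distributed dataset, but how the paper does either is not described in the abstract.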