Datasets extracted from large retail stores are often sparse: they contain a huge number of items and transactions, yet each transaction contains only a few items. As a result, basket analysis yields extremely low item support, customer clustering yields large intra-cluster distances, and transaction classification yields huge classification trees. A similarity measure that takes the depth of the least common ancestor of two items, normalized by the depth of the concept tree, lifts the limitation of binary equality. However, it produces counterintuitive results when the concept hierarchy is unbalanced, since two items in deeper subtrees are very likely to have a higher similarity than two items in shallower subtrees. This research proposes to address the issue by calculating the distance between two items as the number of edges that must be traversed to link them in the concept hierarchy. The method is straightforward, yet it achieves better performance on retail store data when the concept hierarchy is unbalanced.
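The contrast between the two measures can be illustrated with a minimal sketch. The concept hierarchy below, the item names, and the function names are all hypothetical and chosen only for illustration. The abstract does not say whether the proposed distance is normalized or weighted, so this sketch assumes a plain, unweighted edge count.

```python
# Hypothetical, unbalanced concept hierarchy stored as child -> parent.
# "root" has no entry because it has no parent.
PARENT = {
    "food": "root", "tools": "root",
    "dairy": "food", "milk": "dairy",
    "whole_milk": "milk", "skim_milk": "milk",
    "hammer": "tools", "wrench": "tools",
}


def path_to_root(item, parent=PARENT):
    """Return the nodes from item up to the root, item first."""
    path = [item]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def depth(item, parent=PARENT):
    """Number of edges between item and the root."""
    return len(path_to_root(item, parent)) - 1


def least_common_ancestor(a, b, parent=PARENT):
    """Deepest node that lies on both root paths."""
    ancestors_of_a = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("items are not in the same tree")


def lca_depth_similarity(a, b, parent=PARENT):
    """Depth of the least common ancestor divided by the tree depth."""
    tree_depth = max(depth(node, parent) for node in parent)
    return depth(least_common_ancestor(a, b, parent), parent) / tree_depth


def edge_distance(a, b, parent=PARENT):
    """Unweighted count of edges on the path linking a and b (assumed form)."""
    lca = least_common_ancestor(a, b, parent)
    return depth(a, parent) + depth(b, parent) - 2 * depth(lca, parent)


# Both pairs are siblings, but only the edge count treats them alike.
print(lca_depth_similarity("whole_milk", "skim_milk"))  # 0.75
print(lca_depth_similarity("hammer", "wrench"))         # 0.25
print(edge_distance("whole_milk", "skim_milk"))         # 2
print(edge_distance("hammer", "wrench"))                # 2
```

In this toy hierarchy both pairs are siblings under a shared parent, yet the depth-based similarity rates the pair in the deeper subtree three times higher, while the edge count gives both pairs the same distance.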