Clustering transactions with an unbalanced hierarchical product structure

Authors:
MinTzu Wang; PingYu Hsu; K. C. Lin; ShiuannShuoh Chen
Affiliations:
Department of Business Administration, National Central University, Jhongli and Department of Information Management, Technology and Science Institute of Northern Taiwan, R.O.C.; Department of Business Administration, National Central University, Jhongli, Taiwan, R.O.C.; Department of Management Information Systems, National Chung Hsing University, Taichung, Taiwan, R.O.C.; Department of Business Administration, National Central University, Jhongli, Taiwan, R.O.C.
Venue:
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Year:
2007

Abstract

Datasets extracted from large retail stores are often sparse: they contain a huge number of items and transactions, yet each transaction contains only a few items. On such data, basket analysis yields extremely low item support, customer clustering yields large intra-cluster distances, and transaction classification yields huge classification trees. A similarity measure that counts the depth of the least common ancestor of two items, normalized by the depth of the concept tree, lifts the limitation of binary equality. However, it produces counterintuitive results when the concept hierarchy is unbalanced, since two items in deeper subtrees are very likely to have a higher similarity than two items in shallower subtrees. To address this, this research proposes to calculate the distance between two items by counting the edges that must be traversed to link them. The method is straightforward, yet it achieves better performance on retail store data when the concept hierarchy is unbalanced.
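
The contrast between the two measures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the product hierarchy, the function names (lca_similarity, edge_distance), and the exact normalization are assumptions made for illustration, since the abstract does not give the formulas. The hierarchy is deliberately unbalanced, with one deep branch and one shallow branch.

```python
# Hypothetical, deliberately unbalanced concept hierarchy (child -> parent).
# The item names are illustrative only and do not come from the paper.
PARENT = {
    "food": "root",
    "drinks": "food",
    "soda": "drinks",
    "cola": "soda",
    "lemonade": "soda",
    "household": "root",
    "soap": "household",
    "sponge": "household",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Number of edges between `node` and the root."""
    return len(path_to_root(node)) - 1


def lca(a, b):
    """Least common ancestor of `a` and `b`."""
    ancestors_of_a = set(path_to_root(a))
    for node in path_to_root(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes do not share a root")


# Depth of the whole concept tree (deepest node).
TREE_DEPTH = max(depth(n) for n in PARENT)


def lca_similarity(a, b):
    """Assumed form of the baseline: depth of the LCA over the tree depth."""
    return depth(lca(a, b)) / TREE_DEPTH


def edge_distance(a, b):
    """Number of edges traversed to link `a` and `b` through their LCA."""
    common = lca(a, b)
    return (depth(a) - depth(common)) + (depth(b) - depth(common))


if __name__ == "__main__":
    for pair in [("cola", "lemonade"), ("soap", "sponge")]:
        print(pair, lca_similarity(*pair), edge_distance(*pair))
```

In this toy hierarchy both pairs are siblings under a shared parent, so the edge count treats them identically, while the depth-based similarity rates the pair in the deeper subtree higher purely because of where it sits in the tree.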