Clustering Transactional Data

Authors:
Fosca Giannotti;Cristian Gozzi;Giuseppe Manco
Venue:
PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Year:
2002

Abstract

In this paper we present a partitioning method capable of managing transactions, namely tuples of variable size of categorical data. We adapt the standard definition of mathematical distance used in the K-Means algorithm to represent dissimilarity among transactions, and redefine the notion of cluster centroid. The cluster centroid is used as the representative of the common properties of cluster elements. We show that using our concept of cluster centroid together with Jaccard distance we obtain results that are comparable in quality with the most used transactional clustering approaches, but substantially improve their efficiency.
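
As a rough illustration of the ingredients named in the abstract, the sketch below computes the Jaccard distance between two transactions and assigns a transaction to its nearest cluster representative. The abstract does not give the paper's centroid definition, so `frequent_item_representative` and its `min_fraction` parameter are hypothetical stand-ins (items occurring in at least a given fraction of the cluster's transactions), not the authors' method. All function names are assumptions made for this example.

```python
from collections import Counter


def jaccard_distance(a, b):
    """Jaccard distance between two item sets: 1 - |a & b| / |a | b|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty transactions are identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def frequent_item_representative(transactions, min_fraction=0.5):
    """Hypothetical cluster representative (NOT the paper's definition):
    the set of items present in at least min_fraction of the transactions."""
    counts = Counter(item for t in transactions for item in set(t))
    threshold = min_fraction * len(transactions)
    return {item for item, count in counts.items() if count >= threshold}


def assign(transaction, representatives):
    """Index of the representative closest to the transaction."""
    return min(
        range(len(representatives)),
        key=lambda i: jaccard_distance(transaction, representatives[i]),
    )


cluster = [{"milk", "bread"}, {"milk", "eggs"}, {"milk", "bread", "beer"}]
rep = frequent_item_representative(cluster)
nearest = assign({"milk", "bread"}, [{"beer", "chips"}, rep])
```

Because both transactions and representatives are plain item sets, the same distance function serves the assignment step of a K-Means-style loop without any fixed-length vector encoding of the categorical data.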