Clustering categorical data using coverage density

Authors:
Hua Yan;Lei Zhang;Yi Zhang
Affiliations:
Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science & Technology of China, Chengdu, P.R. China;Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science & Technology of China, Chengdu, P.R. China;Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science & Technology of China, Chengdu, P.R. China
Venue:
ADMA'05 Proceedings of the First International Conference on Advanced Data Mining and Applications
Year:
2005

Abstract

This paper proposes a new algorithm for clustering categorical data, based on the idea of coverage density. The algorithm uses average coverage density as its global criterion function and can cluster large, sparse categorical databases effectively. An analysis of its time and space complexity shows that it has low memory and time requirements. Experiments on two real datasets illustrate the performance of the proposed algorithm.
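
The abstract names average coverage density as the criterion function but does not define it. The sketch below is an illustration only, under two assumptions that are not taken from the abstract: that the coverage density of a cluster is the fraction of filled cells in its record-by-item grid (an item being an attribute/value pair), and that the average is weighted by cluster size. The function names `coverage_density` and `average_coverage_density` are hypothetical.

```python
from collections import Counter


def coverage_density(cluster):
    """Assumed definition: filled cells / (records x distinct items)."""
    # Treat each (attribute index, value) pair as one item.
    items = Counter(
        (j, value) for record in cluster for j, value in enumerate(record)
    )
    occurrences = sum(items.values())        # filled cells in the grid
    grid_size = len(cluster) * len(items)    # records x distinct items
    return occurrences / grid_size if grid_size else 0.0


def average_coverage_density(clusters):
    """Assumed criterion: size-weighted mean of per-cluster densities."""
    total = sum(len(c) for c in clusters)
    if total == 0:
        return 0.0
    return sum(len(c) * coverage_density(c) for c in clusters) / total


# Four toy records with two categorical attributes each.
records = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "z")]

# Grouping similar records together yields denser clusters ...
good = [[records[0], records[1]], [records[2], records[3]]]
# ... than mixing dissimilar records.
bad = [[records[0], records[2]], [records[1], records[3]]]

print(average_coverage_density(good) > average_coverage_density(bad))
```

Under these assumptions, a partition that groups records sharing attribute values scores higher than one that mixes them, which is the behaviour a global criterion function of this kind would be expected to reward.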