Clustering partitions a set of objects into disjoint, homogeneous clusters. Clustering methods have long been applied across a wide variety of scientific disciplines. Traditionally, these methods deal with numerical data, i.e. objects represented by a conjunction of numerical attribute values. However, commercial and scientific databases now usually contain categorical data, i.e. objects represented by categorical attributes. In this paper we present a dissimilarity measure that is capable of handling tree-structured categorical data. It can therefore be used to extend the various versions of the very popular k-means clustering algorithm to such data, and we discuss how such an extension can be achieved. Moreover, we show empirically that the proposed dissimilarity measure is accurate compared to other well-known (dis)similarity measures for categorical data.
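To make the idea of a k-means-type extension concrete, the following is a minimal sketch and not the measure proposed in the paper, which the abstract does not specify. It assumes a simple edge-count distance between two values in an attribute's tree, and a k-modes-style center update that picks, per attribute, the tree node with the smallest total distance to the cluster members. The taxonomy `PARENT` and all function names are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical taxonomy for one categorical attribute:
# each value maps to its parent; the root maps to None.
Tree = Dict[str, Optional[str]]
PARENT: Tree = {
    "vehicle": None,
    "car": "vehicle",
    "truck": "vehicle",
    "sedan": "car",
    "hatchback": "car",
}


def path_to_root(node: str, parent: Tree) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_dissimilarity(a: str, b: str, parent: Tree) -> int:
    """Assumed measure: number of edges on the tree path between a and b."""
    path_a = path_to_root(a, parent)
    index_in_b = {n: i for i, n in enumerate(path_to_root(b, parent))}
    for steps_from_a, node in enumerate(path_a):
        if node in index_in_b:  # first shared ancestor is the lowest one
            return steps_from_a + index_in_b[node]
    raise ValueError("values do not belong to the same tree")


def object_dissimilarity(x: Sequence[str], y: Sequence[str],
                         parents: Sequence[Tree]) -> int:
    """Sum the per-attribute tree distances between two objects."""
    return sum(tree_dissimilarity(a, b, p) for a, b, p in zip(x, y, parents))


def assign(objects: Sequence[Sequence[str]], centers: Sequence[Sequence[str]],
           parents: Sequence[Tree]) -> List[int]:
    """k-means-style assignment step: index of the nearest center per object."""
    return [
        min(range(len(centers)),
            key=lambda k: object_dissimilarity(obj, centers[k], parents))
        for obj in objects
    ]


def update_center(members: Sequence[Sequence[str]],
                  parents: Sequence[Tree]) -> Tuple[str, ...]:
    """Per attribute, pick the tree node minimizing total distance to members."""
    center = []
    for j, parent in enumerate(parents):
        best = min(parent, key=lambda c: sum(
            tree_dissimilarity(m[j], c, parent) for m in members))
        center.append(best)
    return tuple(center)
```

Alternating `assign` and `update_center` until the assignments stop changing gives a k-means-type loop under these assumptions. Letting an internal tree node such as `"car"` serve as a center is a design choice of this sketch, not a claim about the paper's method.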