The use of a supervised k-means algorithm on real-valued data with applications in health

Authors:
Sami H. Al-Harbi; Vic J. Rayward-Smith
Affiliations:
School of Information Systems, University of East Anglia, Norwich, United Kingdom; School of Information Systems, University of East Anglia, Norwich, United Kingdom
Venue:
IEA/AIE'2003 Proceedings of the 16th international conference on Developments in applied artificial intelligence
Year:
2003

Abstract

k-means is traditionally viewed as an unsupervised algorithm for clustering a heterogeneous population into a number of more homogeneous groups of objects. However, it is not guaranteed to group objects of the same type (class) together. In such cases, some supervision is needed to place objects that share a class label in the same cluster. This paper demonstrates how the popular k-means clustering algorithm can be profitably modified for use as a classifier. The output field itself cannot be used in the clustering, but it is used to develop a suitable metric defined on the other fields. The proposed algorithm combines simulated annealing with the modified k-means algorithm. We also apply the proposed algorithm to real data sets, where it yields improvements in confidence compared to C4.5.
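
The abstract does not state the form of the metric, the objective, or the annealing schedule, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a weighted Euclidean metric on the input fields, takes "confidence" to mean the fraction of objects that share the majority class of their cluster, and uses Gaussian weight perturbations with geometric cooling. All function names and parameters (`weighted_kmeans`, `confidence`, `anneal_weights`, `t0`, `cooling`) are hypothetical.

```python
import numpy as np

def weighted_kmeans(X, k, w, n_iter=20, seed=0):
    """k-means under a weighted Euclidean metric (assumed form; weights w >= 0)."""
    rng = np.random.default_rng(seed)
    # Scaling each column by sqrt(w) reduces the weighted metric to the ordinary one.
    Xw = X * np.sqrt(w)
    centers = Xw[rng.choice(len(Xw), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every object to every centre.
        d = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centre unchanged
                centers[j] = Xw[labels == j].mean(axis=0)
    return labels

def confidence(labels, y, k):
    """Assumed objective: share of objects matching their cluster's majority class.

    y must hold non-negative integer class labels; it is never used as a
    clustering input, only to score the resulting clusters.
    """
    correct = 0
    for j in range(k):
        members = y[labels == j]
        if members.size:
            correct += np.bincount(members).max()
    return correct / len(y)

def anneal_weights(X, y, k, steps=200, t0=1.0, cooling=0.97, seed=0):
    """Simulated annealing over the metric weights (hypothetical schedule)."""
    rng = np.random.default_rng(seed)
    w = np.ones(X.shape[1])
    score = confidence(weighted_kmeans(X, k, w), y, k)
    best_w, best_score = w.copy(), score
    t = t0
    for _ in range(steps):
        # Propose a nearby weight vector, clipped to stay non-negative.
        cand = np.clip(w + rng.normal(scale=0.2, size=w.shape), 0.0, None)
        cand_score = confidence(weighted_kmeans(X, k, cand), y, k)
        # Metropolis rule: accept improvements, occasionally accept worse moves.
        if cand_score >= score or rng.random() < np.exp((cand_score - score) / t):
            w, score = cand, cand_score
            if score > best_score:
                best_w, best_score = w.copy(), score
        t *= cooling
    return best_w, best_score
```

Calling `anneal_weights(X, y, k)` with a real-valued array `X` and integer class labels `y` returns the best weight vector found and its confidence. Because the search starts from equal weights and keeps the best result seen, the returned confidence is never lower than that of unweighted k-means run with the same seed.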