Adapting k-means for supervised clustering

Authors:
S. H. Al-Harbi; V. J. Rayward-Smith
Affiliations:
Information Center, Riyadh, Saudi Arabia 11485; School of Computing Sciences, University of East Anglia, Norwich, England NR4 7TJ
Venue:
Applied Intelligence
Year:
2006

Abstract

k-means is traditionally viewed as an algorithm for the unsupervised clustering of a heterogeneous population into a number of more homogeneous groups of objects. However, it is not guaranteed to group objects of the same type (class) together. In such cases, some supervision is needed so that objects with the same label are placed in the same cluster. This paper demonstrates how the popular k-means clustering algorithm can be profitably modified for use as a classifier. The output field itself cannot be used in the clustering, but it is used to develop a suitable metric defined on the other fields. The proposed algorithm combines simulated annealing with the modified k-means algorithm. We apply the proposed algorithm to real data sets and compare the output of the resulting classifier with that of C4.5.
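
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say what form the metric takes, what objective is optimised, or how the annealing is scheduled. The sketch assumes a weighted Euclidean metric over the input fields, cluster purity as the objective, and a geometric cooling schedule. All function names and parameters (`weighted_distance`, `kmeans`, `purity`, `anneal_weights`, `t0`, `cooling`) are hypothetical. The class labels are never passed to the clustering step; they are used only to score a candidate set of weights.

```python
import math
import random


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance over the input fields only (assumed metric)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def kmeans(points, k, weights, iterations=20, seed=0):
    """Plain k-means under the weighted metric; returns a cluster index per point."""
    rng = random.Random(seed)  # fixed seed: the result depends only on the weights
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid under the weighted metric.
        assignment = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def purity(assignment, labels, k):
    """Fraction of points that carry the majority label of their cluster."""
    correct = 0
    for c in range(k):
        cluster_labels = [lab for a, lab in zip(assignment, labels) if a == c]
        if cluster_labels:
            correct += max(cluster_labels.count(v) for v in set(cluster_labels))
    return correct / len(labels)


def anneal_weights(points, labels, k, steps=200, t0=0.1, cooling=0.98, seed=1):
    """Simulated annealing over field weights, scored by cluster purity."""
    rng = random.Random(seed)
    weights = [1.0] * len(points[0])  # start from the unweighted metric
    score = purity(kmeans(points, k, weights), labels, k)
    best_weights, best_score = weights[:], score
    temperature = t0
    for _ in range(steps):
        # Perturb one weight, keeping it non-negative.
        candidate = weights[:]
        i = rng.randrange(len(candidate))
        candidate[i] = max(0.0, candidate[i] + rng.gauss(0.0, 0.3))
        cand_score = purity(kmeans(points, k, candidate), labels, k)
        delta = cand_score - score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            weights, score = candidate, cand_score
            if score > best_score:
                best_weights, best_score = weights[:], score
        temperature *= cooling
    return best_weights, best_score
```

Calling `anneal_weights(points, labels, k)` on a list of numeric rows and their class labels returns the best weights found and their purity. Because the search starts from the unweighted metric and tracks the best score seen, the returned purity is never lower than that of ordinary k-means run with the same initialisation.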