An Alternative to Center-Based Clustering Algorithm Via Statistical Learning Analysis

Authors:
Rui Nian;Guangrong Ji;Michel Verleysen
Affiliations:
Machine Learning Group, DICE, Université catholique de Louvain, Louvain-la-Neuve, Belgium 3-B-1348 and College of Information Science and Engineering, Ocean University of China, Qingdao, Chin ...;College of Information Science and Engineering, Ocean University of China, Qingdao, China 266003;Machine Learning Group, DICE, Université catholique de Louvain, Louvain-la-Neuve, Belgium 3-B-1348
Venue:
ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Artificial Intelligence
Year:
2008

Abstract

This paper presents an alternative to center-based clustering algorithms, in particular the k-means algorithm, via statistical learning analysis. The essence of the statistical learning principle, i.e., both the empirical risk and the structural assessment, is taken into account in the clustering algorithm in order to derive a minimization criterion that performs automatic parameter learning and model selection in parallel. The proposed algorithm roughly determines the number of clusters by awarding activation to the winners and assigning a penalty to the rivals. When the algorithm is started with too many clusters and without any prior knowledge, the most competitive center wins for possible prediction, while the extra ones are driven far away. Simulation experiments demonstrate the feasibility of the algorithm and show good performance on both learning tasks (parameter learning and model selection) during clustering.
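
The winner/rival mechanism described above can be illustrated with a short sketch. The abstract does not state the paper's minimization criterion, so this is not the authors' algorithm: it assumes the classical rival penalized competitive learning update, in which the winning center moves toward each sample and the runner-up is pushed slightly away. The function names (`rpcl_step`, `rpcl`), the learning rates, and the win-count weighting of the distance are all illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def rpcl_step(centers, wins, x, lr_win=0.05, lr_rival=0.002):
    """Apply one rival-penalized update for sample x (needs at least 2 centers).

    Mutates `centers` and `wins` in place and returns (winner, rival) indices.
    Assumption: classical RPCL rule, not the criterion derived in the paper.
    """
    total = sum(wins)
    # Frequency-weighted distance: centers that have won often are handicapped,
    # which keeps a single center from capturing every sample.
    scores = [wins[j] / total * sq_dist(x, c) for j, c in enumerate(centers)]
    order = sorted(range(len(centers)), key=scores.__getitem__)
    winner, rival = order[0], order[1]
    # The winner is rewarded: it moves toward the sample.
    centers[winner] = [c + lr_win * (xi - c) for c, xi in zip(centers[winner], x)]
    # The rival is penalized: it moves away from the sample.
    centers[rival] = [c - lr_rival * (xi - c) for c, xi in zip(centers[rival], x)]
    wins[winner] += 1
    return winner, rival


def rpcl(data, k, epochs=50, seed=0, **rates):
    """Run the sketch from k initial centers drawn from the data.

    k is deliberately allowed to exceed the true number of clusters.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(data, k)]
    wins = [1] * k  # start at 1 so the frequency weights are well defined
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            rpcl_step(centers, wins, x, **rates)
    return centers, wins
```

In a run started with too many centers, the redundant ones would be expected to win rarely and drift away from the data, so inspecting `wins` after training is one plausible way to read off a rough cluster count. How well this works depends on the assumed learning rates, which the abstract does not specify.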