Kernel-based clustering methods allow unsupervised partitioning of samples in feature space, but their computation time is quadratic, O(n^2), where n is the number of samples. These methods are therefore generally unsuitable for large datasets. In this paper we propose a meta-algorithm that clusters subsets of the samples in parallel and repeatedly merges the results. The algorithm can be combined with many kernel-based clustering methods; we focus mainly on Kernel Fuzzy C-Means and Relational Neural Gas. We show that the computation time of this algorithm is essentially linear, i.e. O(n). We further evaluate the performance of the meta-algorithm statistically on a real-life dataset, namely the Enron emails.
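The divide-and-merge idea can be illustrated with a minimal sketch. The abstract does not specify how subsets are formed or how partial results are merged, so everything below is an assumption made for illustration: a simple k-medoids on kernel-induced distances stands in for Kernel Fuzzy C-Means or Relational Neural Gas, subsets are consecutive fixed-size chunks, and merging pools the prototypes of two neighbouring results and re-clusters them without weighting by cluster size. The names `rbf_kernel`, `kernel_distance`, `k_medoids` and `chunked_clustering` are hypothetical.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.1):
    """Gaussian (RBF) kernel on tuples of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_distance(x, y, kernel):
    """Squared distance in feature space: k(x,x) - 2 k(x,y) + k(y,y)."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def k_medoids(points, k, kernel, iters=20, seed=0):
    """Stand-in base clusterer (assumption): k-medoids on kernel distances.

    Cost is quadratic in len(points), like the kernel methods in the text.
    """
    rng = random.Random(seed)
    medoids = rng.sample(points, min(k, len(points)))
    for _ in range(iters):
        # Assign every point to its nearest medoid.
        clusters = [[] for _ in medoids]
        for p in points:
            j = min(range(len(medoids)),
                    key=lambda i: kernel_distance(p, medoids[i], kernel))
            clusters[j].append(p)
        # Move each medoid to the member with the smallest total distance.
        new_medoids = []
        for members, old in zip(clusters, medoids):
            if not members:
                new_medoids.append(old)  # keep the medoid of an empty cluster
                continue
            new_medoids.append(min(
                members,
                key=lambda c: sum(kernel_distance(c, q, kernel) for q in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


def chunked_clustering(points, k, chunk_size, kernel=rbf_kernel):
    """Cluster fixed-size chunks, then merge prototype sets pairwise."""
    # Each chunk is independent, so this step could run in parallel.
    level = [k_medoids(points[i:i + chunk_size], k, kernel)
             for i in range(0, len(points), chunk_size)]
    # Pool neighbouring prototype sets and re-cluster until one set is left.
    while len(level) > 1:
        merged = []
        for i in range(0, len(level), 2):
            pooled = [p for protos in level[i:i + 2] for p in protos]
            merged.append(k_medoids(pooled, k, kernel))
        level = merged
    return level[0]


if __name__ == "__main__":
    data = [(0.0,), (5.0,), (0.1,), (5.1,), (0.2,), (5.2,), (0.3,), (5.3,)]
    print(chunked_clustering(data, k=2, chunk_size=4))
```

Under these assumptions the cost argument is easy to follow: with chunk size m, the first level runs n/m quadratic clusterings of m samples each, roughly n·m operations in total, and every merge step works on at most 2k prototypes. For fixed m and k the total therefore grows linearly in n, which matches the O(n) behaviour claimed above.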