Efficient Data Clustering by Local Density Approximation

Authors:
Marc-Ismaël Akodjènou;Patrick Gallinari
Affiliations:
LIP6-Université Paris 6 Pierre et Marie Curie, France, email: Marc-Ismael.Akodjenou@lip6.fr;LIP6-Université Paris 6 Pierre et Marie Curie, France, email: Patrick.Gallinari@lip6.fr
Venue:
Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
Year:
2008

Abstract

Clustering is a key part of the data mining process. With today's massive datasets, methods whose computational complexity is worse than linear are unlikely to be practical. In this paper, we start from a simple assumption: local projections of the data should make it possible to distinguish local cluster structures. From there, we describe how to obtain “pure” local sub-groupings of points from projections onto randomly chosen lines. The clustering of the data is then obtained by clustering these sub-groupings. Our method has linear complexity in the dataset size and requires only one pass over the original dataset. Being local in essence, it can handle the twisted geometries typical of many high-dimensional datasets. We describe the steps of our method and report encouraging results.
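
To make the idea of projecting onto randomly chosen lines concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not say how a projection is turned into sub-groupings, so the largest-gap splitting rule and the helper names (`random_unit_vector`, `project`, `split_at_largest_gap`) are assumptions made purely for illustration.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction by normalizing a vector of Gaussian samples."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(points, direction):
    """Return the scalar projection of each point onto the given direction."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in points]


def split_at_largest_gap(values):
    """Split indices into two groups at the widest gap between sorted values.

    Assumption: the widest 1-D gap is used as the split criterion; the
    abstract does not specify the rule actually used in the paper.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    gaps = [values[order[i + 1]] - values[order[i]] for i in range(len(order) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return order[: cut + 1], order[cut + 1 :]


# Hypothetical toy data: two small groups of 2-D points.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
rng = random.Random(0)
direction = random_unit_vector(2, rng)
projections = project(points, direction)
left, right = split_at_largest_gap(projections)
```

A single random line may or may not separate two groups, which is presumably why the method relies on several local projections rather than one. Each projection costs time linear in the number of points; the sort used here for simplicity is not linear, so this sketch does not reproduce the complexity claimed in the abstract.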