Patch Relational Neural Gas --- Clustering of Huge Dissimilarity Datasets

Authors:
Alexander Hasenfuss;Barbara Hammer;Fabrice Rossi
Affiliations:
Department of Informatics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany;Department of Informatics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany;Projet AxIS, INRIA Rocquencourt, Domaine de Voluceau, Rocquencourt, Le Chesnay Cedex, France 78153
Venue:
ANNPR '08 Proceedings of the 3rd IAPR workshop on Artificial Neural Networks in Pattern Recognition
Year:
2008

Abstract

Clustering constitutes a ubiquitous problem when dealing with huge data sets, whether for data compression, visualization, or preprocessing. Prototype-based neural methods such as neural gas or the self-organizing map offer an intuitive and fast approach which represents data by means of typical representatives and runs in linear time. Recently, an extension of these methods towards relational clustering has been proposed which can handle general non-vectorial data characterized by dissimilarities only, such as alignment or general kernels. This extension, relational neural gas, is directly applicable in important domains such as bioinformatics or text clustering. However, it is quadratic in m both in memory and in time (m being the number of data points). Hence, it is infeasible for huge data sets. In this contribution we introduce an approximate patch version of relational neural gas which relies on the same cost function but dramatically reduces time and memory requirements. It offers a single pass clustering algorithm for huge data sets, running in constant space and linear time only.
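
To make the quadratic cost concrete, the following is a minimal sketch of batch relational clustering in the neural gas style. It is not taken from the paper and it is not the patch algorithm, whose details the abstract does not give. It assumes the usual relational formulation: each prototype is an implicit convex combination of data points (coefficient vector alpha summing to 1), and its squared dissimilarity to point j is (D alpha)_j - 0.5 * alpha^T D alpha. The names relational_distances, relational_neural_gas, and assign, the annealing schedule, and the toy data are hypothetical choices for illustration.

```python
import math


def relational_distances(D, alpha):
    """Squared dissimilarity of every point to one implicit prototype.

    D is an m x m symmetric matrix of squared dissimilarities and alpha
    holds m coefficients summing to 1 (assumed relational formulation).
    """
    m = len(D)
    d_alpha = [sum(D[j][l] * alpha[l] for l in range(m)) for j in range(m)]
    self_term = 0.5 * sum(alpha[j] * d_alpha[j] for j in range(m))
    return [d_alpha[j] - self_term for j in range(m)]


def relational_neural_gas(D, n_prototypes, epochs=20, lam_end=0.01):
    """Batch updates that use only the full m x m matrix D."""
    m = len(D)
    # Illustrative initialisation: each prototype starts on one data point.
    alphas = [[0.0] * m for _ in range(n_prototypes)]
    for i in range(n_prototypes):
        alphas[i][i * m // n_prototypes] = 1.0
    lam0 = n_prototypes / 2.0
    for t in range(epochs):
        # Neighbourhood range shrinks from lam0 to lam_end (assumed schedule).
        lam = lam0 * (lam_end / lam0) ** (t / max(epochs - 1, 1))
        dists = [relational_distances(D, a) for a in alphas]
        weights = [[0.0] * m for _ in range(n_prototypes)]
        for j in range(m):
            # Rank the prototypes by their dissimilarity to point j.
            order = sorted(range(n_prototypes), key=lambda i: dists[i][j])
            for rank, i in enumerate(order):
                weights[i][j] = math.exp(-rank / lam)
        # Normalise so every coefficient vector sums to 1 again.
        alphas = [[w / sum(row) for w in row] for row in weights]
    return alphas


def assign(D, alphas):
    """Index of the closest prototype for every data point."""
    dists = [relational_distances(D, a) for a in alphas]
    return [min(range(len(alphas)), key=lambda i: dists[i][j])
            for j in range(len(D))]


# Toy data: two well-separated groups on a line, seen only through D.
points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
D = [[(p - q) ** 2 for q in points] for p in points]
alphas = relational_neural_gas(D, n_prototypes=2)
labels = assign(D, alphas)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Every step above reads the whole matrix D, so memory grows with m squared. This is the bottleneck that motivates processing the data in patches of fixed size.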