Clustering is a ubiquitous problem when dealing with huge data sets, whether for data compression, visualization, or preprocessing. Prototype-based neural methods such as neural gas or the self-organizing map offer an intuitive and fast approach that represents data by typical representatives and runs in linear time. Recently, an extension of these methods to relational clustering has been proposed; it can handle general non-vectorial data characterized only by dissimilarities, such as alignment or general kernels. This extension, relational neural gas, is directly applicable in important domains such as bioinformatics or text clustering. However, it is quadratic in m both in memory and in time (m being the number of data points) and is therefore infeasible for huge data sets. In this contribution we introduce an approximate patch version of relational neural gas which relies on the same cost function but dramatically reduces time and memory requirements. It offers a single-pass clustering algorithm for huge data sets that runs in constant space and linear time.
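To make the quadratic cost concrete, the following is a minimal sketch of a batch relational neural gas step. It is an illustration under stated assumptions, not the authors' implementation. It assumes that D holds squared dissimilarities, that each prototype is represented implicitly by a coefficient vector over the m data points (rows of `alpha`, each summing to 1), and that the neighborhood range follows an exponential annealing schedule. The function names, the schedule, and all parameter defaults are hypothetical. Note that the full m x m matrix must be held in memory and multiplied in every epoch, which is the bottleneck the patch approach is meant to avoid.

```python
import numpy as np

def relational_distances(D, alpha):
    """Squared distances between all m points and k implicit prototypes.

    D     : (m, m) symmetric matrix of squared dissimilarities
    alpha : (k, m) coefficient rows, each summing to 1
    Returns an (m, k) array with entry [i, j] = [D a_j]_i - 0.5 * a_j^T D a_j.
    """
    Da = D @ alpha.T                                    # (m, k), costs O(k * m^2)
    self_term = 0.5 * np.einsum("jm,mj->j", alpha, Da)  # (k,), a_j^T D a_j / 2
    return Da - self_term

def relational_neural_gas(D, k, epochs=50, lam_end=0.1, seed=0):
    """Batch relational neural gas on a full dissimilarity matrix (sketch).

    The annealing schedule and defaults are assumptions for illustration.
    """
    rng = np.random.default_rng(seed)
    m = D.shape[0]
    # Random convex combinations of the data points as initial prototypes.
    alpha = rng.random((k, m))
    alpha /= alpha.sum(axis=1, keepdims=True)
    lam_start = k / 2
    for t in range(epochs):
        # Neighborhood range shrinks exponentially from lam_start to lam_end.
        lam = lam_start * (lam_end / lam_start) ** (t / max(epochs - 1, 1))
        dist = relational_distances(D, alpha)
        # ranks[i, j] = number of prototypes closer to point i than prototype j.
        ranks = dist.argsort(axis=1).argsort(axis=1)
        h = np.exp(-ranks / lam).T                      # (k, m) neighborhood weights
        alpha = h / h.sum(axis=1, keepdims=True)        # keep each row summing to 1
    return alpha
```

When D contains squared Euclidean distances, the relational distances coincide with the explicit distances to the prototypes `alpha @ X`, so the sketch can be checked against vectorial data even though it never accesses the vectors themselves.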