A clustering algorithm based absorbing nearest neighbors

Authors:
Jian-jun Hu; Chang-jie Tang; Jing Peng; Chuan Li; Chang-an Yuan; An-long Chen
Affiliations:
School of Computer Science, Sichuan University, Chengdu, China (all authors)
Venue:
WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
Year:
2005

Abstract

Clustering high-dimensional data of arbitrary shape at various granularities is a challenge in data mining. This paper proposes the Nearest Neighbors Absorbed First (NNAF) clustering algorithm, which addresses the problem based on the idea that objects in the same cluster must be near one another. The main contributions are: (1) A theorem on searching nearest neighbors (SNN) is proved. Based on it, SNN algorithms are proposed with time complexity O(n log n) or O(n), much faster than the traditional nearest-neighbor search, which takes O(n^2). (2) The NNAF clustering algorithm for high-dimensional data of arbitrary shape is proposed, with time complexity O(n). Experiments show that the new algorithms efficiently process noisy high-dimensional data of arbitrary shape and quickly produce clusterings at various granularities with little domain knowledge.
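The abstract does not spell out the NNAF procedure or the SNN theorem, so the following is only a minimal sketch of the general idea it describes: nearby objects are absorbed into the same cluster. It uses the brute-force O(n^2) nearest-neighbor scan that the abstract calls traditional, not the faster SNN algorithms. The function names and the `max_dist` threshold are assumptions made for illustration and do not come from the paper.

```python
import math


def nearest_neighbor_bruteforce(points):
    """Return (index, distance) of each point's nearest other point.

    This is the plain O(n^2) scan, not the paper's SNN algorithm.
    """
    result = []
    for i, p in enumerate(points):
        best_j, best_d = -1, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        result.append((best_j, best_d))
    return result


def absorb_nearest_neighbors(points, max_dist):
    """Merge each point with its nearest neighbor if it lies within max_dist.

    Hypothetical illustration of "absorbing" nearest neighbors; max_dist is
    an assumed parameter, not one defined in the abstract.
    Returns one cluster label (a representative index) per point.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (j, d) in enumerate(nearest_neighbor_bruteforce(points)):
        if j >= 0 and d <= max_dist:
            parent[find(i)] = find(j)

    return [find(i) for i in range(len(points))]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (0, 2), (10, 10), (10, 11), (50, 50)]
    print(absorb_nearest_neighbors(pts, max_dist=1.5))
```

In this sketch, raising or lowering `max_dist` yields coarser or finer groupings, and points whose nearest neighbor is farther than the threshold stay isolated, which is one simple way to treat noise. Whether NNAF controls granularity and noise in this way is an assumption, not something the abstract confirms.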