Clustering based on a near neighbor graph and a grid cell graph

Authors: Xinquan Chen
Affiliations: School of Computer Science & Engineering, Chongqing Three Gorges University, Chongqing, China
Venue: Journal of Intelligent Information Systems
Year: 2013

Abstract

This paper presents two novel graph-clustering algorithms: Clustering based on a Near Neighbor Graph (CNNG) and Clustering based on a Grid Cell Graph (CGCG). The CNNG algorithm, inspired by the idea of near neighbors, is an improved graph-clustering method based on the Minimum Spanning Tree (MST). To analyze massive data sets more efficiently, we also present the CGCG algorithm, a graph-clustering method that builds the MST at the level of grid cells rather than individual points. To describe the two algorithms clearly, we define several key concepts, including the near neighbor point set, the near neighbor undirected graph, and the grid cell. To implement the two algorithms efficiently, we use partitioning and indexing methods such as multidimensional grid partitioning and a multidimensional index tree. In simulation experiments on several artificial data sets and seven real data sets, we observe that the time cost of the CNNG algorithm can be reduced through improvement techniques and approximation methods while still attaining acceptable clustering quality, and that the CGCG algorithm can approximately analyze some dense data sets in linear time. Moreover, compared with some classical clustering algorithms, the CNNG algorithm often achieves better clustering quality or faster clustering speed.
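The abstract does not give the algorithms' details, so the following is only a minimal sketch of the general idea it describes, not the paper's CNNG or CGCG procedures. It assumes a brute-force near neighbor graph defined by a hypothetical `radius` parameter, Kruskal's algorithm for the spanning forest, and a hypothetical `cut` threshold above which tree edges are discarded. The grid variant assumes uniform cells of a hypothetical width `cell_size` and clusters the centers of occupied cells instead of the points themselves. All function and parameter names are illustrative.

```python
import math
from itertools import combinations


def near_neighbor_edges(points, radius):
    """Return (distance, i, j) for every pair of points at most `radius` apart.

    Brute force, O(n^2); a grid or index tree would replace this in practice.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= radius:
            edges.append((d, i, j))
    return edges


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_clusters(points, radius, cut):
    """Kruskal over the near neighbor graph, ignoring edges longer than `cut`.

    Skipping long edges is equivalent to building the minimum spanning
    forest and then deleting every tree edge longer than `cut`.
    """
    parent = list(range(len(points)))
    for d, i, j in sorted(near_neighbor_edges(points, radius)):
        if d > cut:
            break  # edges are sorted, so all remaining ones are too long
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[ri] = rj
    # Relabel union-find roots as consecutive cluster ids 0, 1, 2, ...
    ids = {}
    return [ids.setdefault(find(parent, i), len(ids)) for i in range(len(points))]


def grid_cell_clusters(points, cell_size, radius, cut):
    """Cluster occupied grid cells by their centers, then label the points."""
    cells = {}
    for idx, p in enumerate(points):
        key = tuple(math.floor(c / cell_size) for c in p)
        cells.setdefault(key, []).append(idx)
    keys = list(cells)
    centers = [tuple((k + 0.5) * cell_size for k in key) for key in keys]
    cell_labels = mst_clusters(centers, radius, cut)
    labels = [0] * len(points)
    for key, label in zip(keys, cell_labels):
        for idx in cells[key]:
            labels[idx] = label
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
point_labels = mst_clusters(points, radius=2.0, cut=1.5)
cell_labels = grid_cell_clusters(points, cell_size=2.0, radius=3.0, cut=3.0)
```

In this sketch the grid variant runs the graph step on one node per occupied cell, so its cost depends on the number of occupied cells rather than the number of points, which is the intuition behind working at the grid-cell level for dense data.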