This paper presents two novel graph-clustering algorithms: Clustering based on a Near Neighbor Graph (CNNG) and Clustering based on a Grid Cell Graph (CGCG). CNNG, inspired by the idea of near neighbors, is an improved graph-clustering method based on the Minimum Spanning Tree (MST). To analyze massive data sets more efficiently, we also present CGCG, a graph-clustering method that builds the MST at the level of grid cells. To describe the two algorithms clearly, we define several key concepts, including the near neighbor point set, the near neighbor undirected graph, and the grid cell. To implement them efficiently, we use partitioning and indexing methods such as multidimensional grid partitioning and a multidimensional index tree. In simulation experiments on several artificial data sets and seven real data sets, we observe that improvement techniques and approximate methods reduce the time cost of CNNG while maintaining acceptable clustering quality, and that CGCG can approximately analyze some dense data sets in linear time. Moreover, compared with some classical clustering algorithms, CNNG often achieves better clustering quality or faster clustering speed.
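For readers unfamiliar with the MST-based clustering that both algorithms build on, the following is a minimal sketch of the classical approach: build an MST over the points, cut its longest edges, and treat the remaining connected components as clusters. It is not CNNG or CGCG themselves, whose details are not given here. The function names `mst_edges` and `mst_clusters`, the use of Euclidean distance, and the parameter `k` (the desired number of clusters) are all assumptions made for illustration.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the MST as a list of (weight, i, j) edges. Runs in O(n^2),
    which is the cost that grid- or index-based methods aim to reduce.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest point that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], parent[u], u))
        # Relax distances from the newly added point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])  # requires Python 3.8+
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def mst_clusters(points, k):
    """Cut the k-1 longest MST edges and label the resulting components.

    Illustrative baseline only; `k` is an assumed input, not a parameter
    taken from the paper.
    """
    edges = sorted(mst_edges(points))
    kept = edges[: max(len(edges) - (k - 1), 0)]

    # Union-find over the kept edges to recover connected components.
    root = list(range(len(points)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for _, i, j in kept:
        root[find(i)] = find(j)

    # Assign labels 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]
```

Calling `mst_clusters(points, 2)` on two well-separated groups of points removes the single long edge joining them and returns one label per group. The quadratic cost of building the MST on raw points is what motivates working with near neighbors or grid cells instead of the complete graph.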