A tree-structured framework for purifying "complex" clusters with structural roles of individual data

  • Authors:
  • Jundi Ding;Runing Ma;Jingyu Yang;Songcan Chen

  • Affiliations:
  • School of Computer Science and Technology, Nanjing University of Science and Technology, China and Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautic ...;Department of Science, Nanjing University of Aeronautics and Astronautics, China;School of Computer Science and Technology, Nanjing University of Science and Technology, China;Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, China

  • Venue:
  • Pattern Recognition
  • Year:
  • 2010

Abstract

How can we find a natural clustering of a "complex" dataset, which may contain an unknown number of overlapping clusters of arbitrary shape and be contaminated by noise? This paper proposes a tree-structured framework that purifies such clusters by exploring the structural role of each data point. In practice, each object within the internal organization of the data has its own specific role ("centroid", hub, or outlier), determined by its distinctive associations with its neighbors. Adjacent centroids always interact with each other and serve as intermediate nodes of one tree, being members of the same cluster. Hubs close to some centroid become leaf nodes, which terminate the growth of the trees. Outliers that are only weakly connected to any centroid are typically discarded from every tree as global noise. All the data can thus be labeled by a specified criterion of "centroids"-connected structural consistency (CCSC). Because it requires no domain-specific information, our framework with CCSC can adapt to a wide range of clustering-related applications. Theoretical and experimental results both confirm that the framework is easy to interpret and implement, and efficient and effective for "complex" clustering.
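
The abstract does not give the formal definitions behind CCSC, so the following Python sketch only illustrates the general idea of role-based tree growth under assumptions that are not taken from the paper. It assumes a density-style rule: a point with at least `min_neighbors` other points within distance `eps` is treated as a "centroid", a non-centroid within `eps` of a centroid as a hub, and everything else as an outlier. The function name and both parameters are hypothetical.

```python
from collections import deque
import math


def label_roles_and_grow_trees(points, eps, min_neighbors):
    """Illustrative role labeling and tree growth (assumed rules, not the paper's CCSC).

    Returns (roles, cluster, parent), where cluster[i] == -1 marks a discarded
    outlier and parent[i] is the tree parent of point i (None for roots and outliers).
    """
    n = len(points)
    # Assumed neighborhood rule: all other points within distance eps.
    neighbors = [
        [j for j in range(n) if j != i and math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # Assumed role rule: enough neighbors makes a point a "centroid".
    is_centroid = [len(nb) >= min_neighbors for nb in neighbors]
    roles = ["centroid" if c else "outlier" for c in is_centroid]
    cluster = [-1] * n
    parent = [None] * n

    next_id = 0
    for root in range(n):
        if not is_centroid[root] or cluster[root] != -1:
            continue
        # Grow one tree per connected group of centroids (breadth-first).
        cluster[root] = next_id
        queue = deque([root])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                if cluster[j] != -1:
                    continue
                cluster[j] = next_id
                parent[j] = i
                if is_centroid[j]:
                    queue.append(j)   # internal node: the tree keeps growing
                else:
                    roles[j] = "hub"  # leaf node: growth stops here
        next_id += 1
    return roles, cluster, parent
```

In this simplified version a hub is attached to the first tree that reaches it, so overlapping clusters, which the paper explicitly targets, are not modeled; points never reached by any tree keep the label `-1` and play the part of global noise.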