NPUST: An Efficient Clustering Algorithm Using Partition Space Technique for Large Databases

  • Authors:
  • Cheng-Fa Tsai; Heng-Fu Yeh

  • Affiliations:
  • Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan 91201; Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan 91201

  • Venue:
  • IEA/AIE '09 Proceedings of the 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems: Next-Generation Applied Intelligence
  • Year:
  • 2009

Abstract

The rapid progress of information technology has led to ever larger amounts of data being produced and stored in databases. Extracting implicit, useful information from these data at low time cost and with high correctness is a priority concern in data mining, which explains why many clustering methods have been developed in recent decades. This work presents a new clustering algorithm named NPUST, an enhanced version of KIDBSCAN. NPUST is a hybrid density-based approach that partitions the dataset using K-means and then clusters the resulting partitions with IDBSCAN. Finally, the closest pairs of clusters are merged until the natural number of clusters of the dataset is reached. Experimental results indicate that the proposed algorithm can handle entire clusters and lowers the run-time cost. They also reveal that the proposed algorithm performs better than several well-known existing approaches, namely the K-means, DBSCAN, IDBSCAN and KIDBSCAN algorithms. Consequently, the proposed NPUST algorithm is efficient and effective for data clustering.
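
The three stages named in the abstract (partition with K-means, cluster each partition with a density-based method, merge the closest cluster pairs) can be illustrated with a minimal Python sketch. It is not the authors' implementation: plain DBSCAN stands in for IDBSCAN, "closest" is assumed to mean smallest centroid distance, and the target cluster count is passed in rather than discovered. All function names and parameters (`kmeans`, `dbscan`, `merge_closest`, `npust_sketch`, `eps`, `min_pts`, `target`) are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a partition index for every point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (assumed stand-in for IDBSCAN); noise points are dropped."""
    labels = {}  # point index -> cluster id, or -1 for noise

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cid = 0
    for i in range(len(points)):
        if i in labels:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        labels[i] = cid
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels.get(j) == -1:
                labels[j] = cid  # noise reached from a core point: border
                continue
            if j in labels:
                continue
            labels[j] = cid
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
        cid += 1
    return [[points[i] for i in range(len(points)) if labels[i] == c]
            for c in range(cid)]


def centroid(cluster):
    return tuple(sum(x) / len(cluster) for x in zip(*cluster))


def merge_closest(clusters, target):
    """Merge the pair with the nearest centroids until `target` clusters remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > target:
        pairs = [(a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        a, b = min(pairs, key=lambda ab: math.dist(centroid(clusters[ab[0]]),
                                                   centroid(clusters[ab[1]])))
        clusters[a].extend(clusters.pop(b))  # b > a, so index a stays valid
    return clusters


def npust_sketch(points, k, eps, min_pts, target):
    """Partition, cluster each partition, then merge (assumed reading of the abstract)."""
    assign = kmeans(points, k)
    clusters = []
    for part in range(k):
        subset = [p for p, a in zip(points, assign) if a == part]
        clusters.extend(dbscan(subset, eps, min_pts))
    return merge_closest(clusters, target)
```

Calling `npust_sketch(points, k=4, eps=3.0, min_pts=3, target=2)` on a list of coordinate tuples returns a list of clusters, each a list of points. Running the density-based step per partition keeps each neighborhood search confined to a smaller subset, which is one plausible source of the run-time savings the abstract reports; the merge step then rejoins clusters that the partitioning split apart.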