PHD: an efficient data clustering scheme using partition space technique for knowledge discovery in large databases

  • Authors:
  • Cheng-Fa Tsai;Heng-Fu Yeh;Jui-Fang Chang;Ning-Han Liu

  • Affiliations:
  • Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan 91201;Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan 91201;Department of International Business, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan 80778;Department of Management Information Systems, National Pingtung University of Science and Technology, Pingtung, Taiwan 91201

  • Venue:
  • Applied Intelligence
  • Year:
  • 2010

Abstract

Rapid technological advances mean that the amount of data stored in databases is growing very quickly. Data mining can discover useful implicit information in such large databases. A priority concern in data mining is how to detect this information at low time cost, with high correctness and a high noise-filtering rate, in a way that scales to large databases, which explains why numerous clustering schemes have been proposed in recent decades. This investigation presents a new data clustering approach called PHD, an enhanced version of KIDBSCAN. PHD is a hybrid density-based algorithm that partitions the data set with K-means and then clusters the resulting partitions with IDBSCAN. Finally, the closest pairs of clusters are merged until the natural number of clusters in the data set is reached. Experimental results show that the proposed algorithm completes the entire clustering while efficiently reducing run-time cost. They also indicate that it performs better than several well-known existing schemes such as the K-means, DBSCAN, IDBSCAN and KIDBSCAN algorithms. Consequently, the proposed PHD algorithm is efficient and effective for data clustering in large databases.
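
The three stages named in the abstract (partition, cluster each partition, merge the closest clusters) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Textbook DBSCAN stands in for IDBSCAN, the gap between two clusters is assumed to be their smallest point-to-point distance, and the "natural number of clusters" is assumed to be supplied by the caller as `n_clusters`. All function and parameter names here are hypothetical.

```python
import math
import random

def centroid(pts):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns the non-empty partitions."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        parts = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            parts[nearest].append(p)
        # Keep the old center when a partition ends up empty.
        centers = [centroid(part) if part else centers[i]
                   for i, part in enumerate(parts)]
    return [part for part in parts if part]

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN (assumed stand-in for IDBSCAN); noise is dropped."""
    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]
    labels = {}  # point index -> cluster id, -1 marks noise
    cid = 0
    for i in range(len(points)):
        if i in labels:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cid
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels.get(j, -1) != -1:
                continue  # already assigned to a cluster
            unvisited = j not in labels
            labels[j] = cid  # unvisited point or former noise joins the cluster
            if unvisited:
                j_nbrs = neighbors(j)
                if len(j_nbrs) >= min_pts:  # j is a core point: keep expanding
                    queue.extend(j_nbrs)
        cid += 1
    clusters = [[] for _ in range(cid)]
    for i, c in labels.items():
        if c >= 0:
            clusters[c].append(points[i])
    return clusters

def gap(a, b):
    """Assumed cluster distance: smallest point-to-point distance."""
    return min(math.dist(p, q) for p in a for q in b)

def merge_closest(clusters, n_clusters):
    """Merge the closest pair repeatedly until n_clusters remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: gap(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))
    return clusters

def phd_sketch(points, k, eps, min_pts, n_clusters):
    """Partition with K-means, cluster each partition, then merge."""
    local = [c for part in kmeans(points, k)
             for c in dbscan(part, eps, min_pts)]
    return merge_closest(local, n_clusters)

# Two well-separated 5x5 grids of points, split into up to four partitions.
blob_a = [(0.5 * x, 0.5 * y) for x in range(5) for y in range(5)]
blob_b = [(10 + px, 10 + py) for px, py in blob_a]
clusters = phd_sketch(blob_a + blob_b, k=4, eps=0.75, min_pts=2, n_clusters=2)
print(len(clusters))
```

On this toy input the merge step should rejoin the fragments that K-means cut apart, leaving one cluster per grid. The pairwise `gap` computation is quadratic and is used only for readability; it says nothing about the run-time behaviour reported for PHD.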