Rearranging data objects for efficient and stable clustering

  • Authors:
  • Gyesung Lee;Xindong Wu;Jinho Chon

  • Affiliations:
  • Dankook University, Chonan, Chung-Nam, Korea;University of Vermont, Burlington, Vermont;Dankook University, Chonan, Chung-Nam, Korea

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

When a data mining algorithm derives a partitional structure from a data set, it is not unusual for the outcome to change when the same data are presented in a different order. This is known as the order bias problem. To overcome it, a first clustering pass constructs an initial partition, which is expected to indicate the possible range for the number of final clusters. We then apply center sorting to the data objects in the clusters of that partition to rearrange them in a new order, and reapply the same clustering procedure to the rearranged data set to build a new partition. We have developed an algorithm, REIT, that achieves both efficiency and reliability. A number of experiments have been performed to show that the algorithm helps minimize order bias effects.
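
The cluster, rearrange, recluster loop described in the abstract can be illustrated with a minimal sketch. This is not the REIT algorithm itself, whose details the abstract does not give. It assumes a simple order-sensitive "leader"-style clusterer as a stand-in, and it assumes that center sorting means ordering the objects of each cluster by their distance to the cluster centroid. The names `leader_cluster`, `center_sort`, `recluster`, and the `threshold` parameter are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def leader_cluster(data, threshold):
    """Order-sensitive stand-in clusterer (an assumption, not REIT).

    Each object joins the nearest existing cluster whose centroid lies
    within `threshold`; otherwise it starts a new cluster. The result
    can depend on the order in which `data` is presented.
    """
    clusters = []
    for p in data:
        best, best_d = None, threshold
        for c in clusters:
            d = dist(p, centroid(c))
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([p])
        else:
            best.append(p)
    return clusters

def center_sort(clusters):
    """Assumed reading of center sorting: within each cluster, order
    objects by distance to the centroid (closest first), then
    concatenate the clusters into one rearranged data sequence."""
    ordered = []
    for c in clusters:
        ctr = centroid(c)
        ordered.extend(sorted(c, key=lambda p: dist(p, ctr)))
    return ordered

def recluster(data, threshold):
    """First pass, rearrangement, then the same procedure again."""
    initial = leader_cluster(data, threshold)
    rearranged = center_sort(initial)
    return leader_cluster(rearranged, threshold)

data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
final_partition = recluster(data, threshold=3.0)
```

Placing the most central objects first is one plausible way to make the second pass less dependent on the original input order, since early objects have the largest influence on where an incremental clusterer places its clusters.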