A fast fuzzy clustering algorithm for large-scale datasets

  • Authors:
  • Lukui Shi; Pilian He

  • Affiliations:
  • Department of Computer Science and Technology, Tianjin University, Tianjin, China;Department of Computer Science and Technology, Tianjin University, Tianjin, China

  • Venue:
  • ADMA'05 Proceedings of the First International Conference on Advanced Data Mining and Applications
  • Year:
  • 2005

Abstract

The transitive closure method is one of the most frequently used fuzzy clustering techniques. Building the transitive closure requires repeated matrix composition, which takes O(n³ log₂ n) time and O(n²) space. These costs limit its application to large-scale databases. In this paper, we propose a fast fuzzy clustering algorithm that avoids matrix multiplication, and we give a principle by which the clustering results are obtained directly from the λ-cut of the fuzzy similarity relation on the objects. Moreover, the similarity matrix of the objects need not be computed and stored beforehand. The time complexity of the presented algorithm is at most O(n²) and its space complexity is O(1). Theoretical analysis and experiments demonstrate that, although the new algorithm is equivalent to the transitive closure method, it is better suited to large-scale datasets because of its high computational efficiency.
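
The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea it describes, not the authors' method. It relies on a standard fact: the λ-cut of the transitive closure groups two objects together exactly when they are linked by a chain of pairs whose similarity is at least λ. The clusters can therefore be read off as connected components of the λ-cut graph, with each similarity computed on demand rather than stored in an n × n matrix. The names `lambda_cut_clusters` and `inverse_distance_similarity` are hypothetical, and the similarity measure is an arbitrary choice for illustration. This sketch keeps one label per object (O(n) memory for the result), so it does not reproduce the O(1) space bound claimed above.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def inverse_distance_similarity(a: float, b: float) -> float:
    """Hypothetical similarity in (0, 1]; any symmetric, reflexive measure would do."""
    return 1.0 / (1.0 + abs(a - b))


def lambda_cut_clusters(
    objects: Sequence[T],
    similarity: Callable[[T, T], float],
    lam: float,
) -> List[List[int]]:
    """Group object indices into clusters at threshold lam.

    Two objects share a cluster when a chain of pairs with similarity >= lam
    connects them. No similarity matrix is built: each pair is evaluated once
    and discarded, giving O(n^2) similarity evaluations.
    """
    n = len(objects)
    parent = list(range(n))  # union-find forest: one label per object

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity(objects[i], objects[j]) >= lam:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 5.0, 5.1]
    print(lambda_cut_clusters(data, inverse_distance_similarity, 0.9))
```

In this toy run, objects 0 and 2 have direct similarity below 0.9 but still land in the same cluster because both are similar enough to object 1, which is the effect the transitive closure would otherwise be computed to capture.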