Non-crisp Clustering by Fast, Convergent, and Robust Algorithms

  • Authors:
  • Vladimir Estivill-Castro;Jianhua Yang

  • Affiliations:
  • -;-

  • Venue:
  • PKDD '01 Proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

We provide sub-quadratic clustering algorithms for generic dissimilarity. Our algorithms are robust because they use medians rather than means as estimators of location, and the resulting representative of a cluster is actually a data item. We demonstrate mathematically that our algorithms converge. The methods proposed generalize approaches that allow a data item to have a degree of membership in a cluster. Because our algorithm is generic to both, fuzzy membership approaches and probabilistic approaches for partial membership, we simply name it non-crisp clustering. We illustrate our algorithms with categorizing WEB visitation paths. We outperform previous clustering methods since they are all of quadratic time complexity (they essentially require computing the dissimilarity between all pairs of paths).