Improving k-means by outlier removal

  • Authors:
  • Ville Hautamäki;Svetlana Cherednichenko;Ismo Kärkkäinen;Tomi Kinnunen;Pasi Fränti

  • Affiliations:
  • Speech and Image Processing Unit, Department of Computer Science, University of Joensuu, Joensuu, Finland (all authors)

  • Venue:
  • SCIA'05: Proceedings of the 14th Scandinavian Conference on Image Analysis
  • Year:
  • 2005

Abstract

We present an Outlier Removal Clustering (ORC) algorithm that performs outlier detection and data clustering simultaneously. The method combines clustering with outlier discovery to improve estimation of the centroids of the generative distribution. The proposed algorithm consists of two stages: the first stage is a pure K-means process, and the second stage iteratively removes the vectors that are far from their cluster centroids. We provide experimental results on three synthetic datasets and three map images corrupted by lossy compression. The results indicate that the proposed method has a lower error than the competing methods on datasets with overlapping clusters.
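
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how "far from their cluster centroids" is measured, so the removal rule used here (distance to the nearest centroid divided by the largest such distance, compared against a threshold) is an assumption, as are the names kmeans, orc_sketch, init_centroids, threshold and n_rounds and the toy data.

```python
import numpy as np


def kmeans(X, centroids, n_iter=20):
    """Plain Lloyd-style K-means starting from the given centroids."""
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Assign every vector to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (skip empty clusters).
        for k in range(len(centroids)):
            members = X[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return centroids


def orc_sketch(X, init_centroids, threshold=0.9, n_rounds=3):
    """Hypothetical ORC-style loop: K-means, then repeated removal of far vectors.

    The removal rule is an assumption: a vector is dropped when its distance
    to the nearest centroid, divided by the largest such distance, exceeds
    `threshold`.
    """
    # Stage 1: pure K-means on the full dataset.
    centroids = kmeans(X, init_centroids)
    kept = X
    # Stage 2: alternate between removing far vectors and re-running K-means.
    for _ in range(n_rounds):
        dists = np.linalg.norm(kept[:, None, :] - centroids[None, :, :], axis=2)
        nearest = dists.min(axis=1)
        d_max = nearest.max()
        if d_max == 0:
            break  # every vector sits exactly on a centroid
        mask = nearest / d_max <= threshold
        if mask.sum() < len(centroids):
            break  # too few vectors would remain to support the centroids
        kept = kept[mask]
        centroids = kmeans(kept, centroids)
    return centroids, kept


# Toy data (made up for illustration): two tight clusters and one stray vector.
offsets = np.array([[dx, dy] for dx in (-0.5, 0.0, 0.5) for dy in (-0.5, 0.0, 0.5)])
X = np.vstack([offsets, offsets + 10.0, [[20.0, 20.0]]])
init = np.array([[0.0, 0.0], [10.0, 10.0]])

centroids, kept = orc_sketch(X, init, threshold=0.9, n_rounds=1)
print(len(X), len(kept))
print(centroids)
```

In this toy run a single removal round is used, so only the stray vector is dropped and the second centroid is re-estimated from its cluster alone. With this particular rule, each further round always removes at least the farthest remaining vector, so in practice the threshold and the number of rounds would need to be tuned to the data.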