Heuristic Methods for Large Centroid Clustering Problems

  • Authors:
  • Éric D. Taillard

  • Affiliations:
  • EIVD, University of Applied Sciences of Western Switzerland, Route de Cheseaux 1, Case Postale, CH-1401 Yverdon-les-Bains, Switzerland. Eric.Taillard@eivd.ch http://www.ina.eivd.ch/taillard

  • Venue:
  • Journal of Heuristics
  • Year:
  • 2003

Abstract

This article presents new heuristic methods for solving a class of hard centroid clustering problems, including the p-median, the sum-of-squares clustering and the multi-source Weber problems. Centroid clustering consists of partitioning a set of entities into a given number of subsets and finding the location of a centre for each subset such that a dissimilarity measure between the entities and the centres is minimized. The first method proposed is a candidate list search that produces good solutions in a short amount of time if the number of centres in the problem is not too large. The second method is a general local optimization approach that finds very good solutions. The third method is designed for problems with a large number of centres; it decomposes the problem into subproblems that are solved independently. Numerical results show that these methods are efficient: they improve dozens of the best-known solutions to problem instances from the literature. They are also fast, handling problem instances with more than 85,000 entities and 15,000 centres, which is much larger than those previously solved in the literature. The expected complexity of these new procedures is discussed and shown to be comparable to that of an existing method which is known to be very fast.
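To make the centroid clustering objective concrete, the following minimal sketch evaluates and solves a tiny p-median instance by exhaustive enumeration. It is not one of the article's heuristics; it only illustrates the quantity those heuristics aim to minimize. The function names (`assign`, `cost`, `brute_force_p_median`), the Euclidean dissimilarity and the six sample points are assumptions chosen for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations


def assign(points, centres):
    """Return, for each point, the index of its nearest centre."""
    return [
        min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
        for p in points
    ]


def cost(points, centres):
    """Sum of dissimilarities between each point and its nearest centre.

    Assumption: dissimilarity is plain Euclidean distance.
    """
    return sum(min(math.dist(p, c) for c in centres) for p in points)


def brute_force_p_median(points, p):
    """Try every subset of p points as centres and keep the cheapest.

    Exhaustive enumeration is only feasible for very small instances,
    which is why heuristics are needed for large problems.
    """
    best = min(combinations(points, p), key=lambda cs: cost(points, cs))
    return list(best), cost(points, best)


# Hypothetical toy instance: two well-separated groups of three points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
centres, total = brute_force_p_median(points, 2)
labels = assign(points, centres)
print(centres, total, labels)
```

In the p-median problem the centres are restricted to the entities themselves, as in this sketch; the sum-of-squares and multi-source Weber variants would instead allow centres anywhere in the plane and, for sum-of-squares, use squared distances. The number of subsets examined here grows combinatorially with the number of entities and centres, so enumeration is out of reach for instances of the size reported in the abstract.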