A Global Optimization RLT-based Approach for Solving the Hard Clustering Problem

  • Authors:
  • Hanif D. Sherali; Jitamitra Desai

  • Affiliations:
  • Grado Department of Industrial and Systems Engineering (0118), Virginia Polytechnic Institute and State University, Blacksburg, U.S.A. 24061; Grado Department of Industrial and Systems Engineering (0118), Virginia Polytechnic Institute and State University, Blacksburg, U.S.A. 24061

  • Venue:
  • Journal of Global Optimization
  • Year:
  • 2005

Abstract

The field of cluster analysis is primarily concerned with the sorting of data points into different clusters so as to optimize a certain criterion. Rapid advances in technology have made it possible to address clustering problems via optimization theory. In this paper, we present a global optimization algorithm to solve the hard clustering problem, where each data point is to be assigned to exactly one cluster. The hard clustering problem is formulated as a nonlinear program, for which a tight linear programming relaxation is constructed via the Reformulation-Linearization Technique (RLT) in concert with additional valid inequalities that serve to defeat the inherent symmetry in the problem. This construct is embedded within a specialized branch-and-bound algorithm to solve the problem to global optimality. Pertinent implementation issues that can enhance the efficiency of the branch-and-bound algorithm are also discussed. Computational experience is reported using several standard data sets found in the literature as well as synthetically generated larger problem instances. The results validate the robustness of the proposed algorithmic procedure and exhibit its dominance over the popular k-means clustering technique. Finally, a heuristic procedure that obtains a good-quality solution with relatively modest computational effort is also described.
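
To make the terms in the abstract concrete, the sketch below solves a tiny hard clustering instance to global optimality by exhaustive enumeration. It is not the RLT-based branch-and-bound algorithm of the paper. It assumes the common sum-of-squared-distances-to-centroid criterion (the abstract does not state the objective), and the function names `sse`, `canonical_assignments`, and `brute_force_hard_clustering` are hypothetical. The symmetry that the abstract mentions appears here as the fact that relabeling the clusters leaves the objective unchanged; the enumeration skips such duplicates by only generating assignments whose cluster indices appear in order of first use.

```python
def sse(points, labels, k):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def canonical_assignments(n, k):
    """Yield hard assignments of n points to at most k clusters, one per
    equivalence class under cluster relabeling (indices appear in order of first use)."""
    def extend(prefix, used):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # A point may join any cluster already in use, or open the next new one.
        for c in range(min(used + 1, k)):
            yield from extend(prefix + [c], max(used, c + 1))
    yield from extend([], 0)


def brute_force_hard_clustering(points, k):
    """Globally optimal assignment by enumeration; only viable for tiny instances."""
    best_labels, best_cost = None, float("inf")
    for labels in canonical_assignments(len(points), k):
        cost = sse(points, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


# Illustrative data (made up for this sketch): two well-separated pairs.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels, cost = brute_force_hard_clustering(points, 2)
print(labels, cost)
```

On this four-point instance the enumeration groups the two left-hand points together and the two right-hand points together. The number of assignments grows exponentially with the number of points, which is why exact methods rely on bounding schemes such as the LP relaxation described in the abstract rather than enumeration.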