Sharp kernel clustering algorithms and their associated Grothendieck inequalities

  • Authors:
  • Subhash Khot; Assaf Naor

  • Affiliations:
  • NSF; NSF

  • Venue:
  • SODA '10: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms
  • Year:
  • 2010

Abstract

In the kernel clustering problem we are given a (large) $n \times n$ symmetric positive semidefinite matrix $A = (a_{ij})$ with $\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} = 0$ and a (small) $k \times k$ symmetric positive semidefinite matrix $B = (b_{ij})$. The goal is to find a partition $\{S_1, \ldots, S_k\}$ of $\{1, \ldots, n\}$ which maximizes $\sum_{i=1}^{k} \sum_{j=1}^{k} \left( \sum_{(p,q) \in S_i \times S_j} a_{pq} \right) b_{ij}$. We design a polynomial time approximation algorithm that achieves an approximation ratio of $R(B)^2 / C(B)$, where $R(B)$ and $C(B)$ are geometric parameters that depend only on the matrix $B$, defined as follows: if $b_{ij} = \langle v_i, v_j \rangle$ is the Gram matrix representation of $B$ for some $v_1, \ldots, v_k \in \mathbb{R}^k$ then $R(B)$ is the minimum radius of a Euclidean ball containing the points $\{v_1, \ldots, v_k\}$. The parameter $C(B)$ is defined as the maximum over all measurable partitions $\{A_1, \ldots, A_k\}$ of $\mathbb{R}^{k-1}$ of the quantity $\sum_{i=1}^{k} \sum_{j=1}^{k} b_{ij} \langle z_i, z_j \rangle$, where for $i \in \{1, \ldots, k\}$ the vector $z_i \in \mathbb{R}^{k-1}$ is the Gaussian moment of $A_i$, i.e., $z_i = \frac{1}{(2\pi)^{(k-1)/2}} \int_{A_i} x \, e^{-\|x\|_2^2 / 2} \, dx$. We also show that for every $\varepsilon > 0$, achieving an approximation guarantee of $(1 - \varepsilon) R(B)^2 / C(B)$ is Unique Games hard.
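
To make the objective in the abstract concrete, the following is a minimal Python sketch that evaluates it for a given partition and, for tiny inputs only, finds the optimum by exhaustive search. It is not the approximation algorithm from the paper. The function names `clustering_objective` and `brute_force_optimum` are hypothetical, a partition is encoded as a label vector (`labels[p] == i` means p belongs to S_i), and empty parts are assumed to be allowed.

```python
import itertools

import numpy as np


def clustering_objective(A, B, labels):
    """Evaluate sum_{i,j} (sum_{(p,q) in S_i x S_j} a_pq) * b_ij.

    labels[p] is the index i of the part S_i containing p (an assumed encoding).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Entry (p, q) of A is weighted by b_{labels[p], labels[q]}.
    weights = B[np.ix_(labels, labels)]
    return float(np.sum(A * weights))


def brute_force_optimum(A, B):
    """Try all k^n label vectors; feasible only for very small n."""
    n, k = len(A), len(B)
    best_value, best_labels = -np.inf, None
    for labels in itertools.product(range(k), repeat=n):
        value = clustering_objective(A, B, labels)
        if value > best_value:  # strict: keeps the first maximizer found
            best_value, best_labels = value, labels
    return best_value, best_labels


# Toy instance: A is the Gram matrix of the centered points -1, -1, 1, 1,
# so A is positive semidefinite and its entries sum to zero.
x = np.array([-1.0, -1.0, 1.0, 1.0])
A = np.outer(x, x)
B = np.eye(2)  # k = 2; B is positive semidefinite

value, labels = brute_force_optimum(A, B)
print(value, labels)
```

With B equal to the identity, the objective reduces to the sum over parts of the squared sum of the points in each part, so on this toy instance the search separates the two negative points from the two positive ones. The search costs k^n objective evaluations, which is why a polynomial time approximation algorithm is of interest.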