Approximate Kernel Clustering

Authors:
Subhash Khot; Assaf Naor
Affiliations:
-;-
Venue:
FOCS '08 Proceedings of the 2008 49th Annual IEEE Symposium on Foundations of Computer Science
Year:
2008

Citing 0
Cited 4

Sharp kernel clustering algorithms and their associated Grothendieck inequalities

SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
The positive semidefinite Grothendieck problem with rank constraint

ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming
Algorithms and hardness for subspace approximation

Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Solution of the propeller conjecture in $\mathbb{R}^3$

STOC '12 Proceedings of the forty-fourth annual ACM symposium on Theory of computing

Abstract

In the kernel clustering problem we are given a large $n\times n$ positive semi-definite matrix $A=(a_{ij})$ with $\sum_{i,j=1}^n a_{ij}=0$ and a small $k\times k$ positive semi-definite matrix $B=(b_{ij})$. The goal is to find a partition $S_1,\ldots,S_k$ of $\{1,\ldots,n\}$ which maximizes the quantity
$$\sum_{i,j=1}^k \left(\sum_{(p,q)\in S_i\times S_j}a_{pq}\right)b_{ij}.$$
We study the computational complexity of this generic clustering problem, which originates in the theory of machine learning. We design a constant factor polynomial time approximation algorithm for this problem, answering a question posed by Song, Smola, Gretton and Borgwardt. In some cases we manage to compute the sharp approximation threshold for this problem assuming the Unique Games Conjecture (UGC). In particular, when $B$ is the $3\times 3$ identity matrix the UGC hardness threshold of this problem is exactly $\frac{16\pi}{27}$. We present and study a geometric conjecture of independent interest which we show would imply that the UGC threshold when $B$ is the $k\times k$ identity matrix is $\frac{8\pi}{9}\left(1-\frac{1}{k}\right)$ for every $k\ge 3$.
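To make the objective concrete, the following is a minimal sketch in Python (NumPy assumed) that evaluates the clustering objective for a given partition and finds the best partition of a tiny instance by exhaustive search. It is not the approximation algorithm of the paper, which the abstract does not describe. The function names `clustering_objective`, `brute_force_best` and `conjectured_threshold`, and the four-point example data, are assumptions introduced only for illustration.

```python
import itertools
import math

import numpy as np


def clustering_objective(A, B, labels):
    """Sum over cluster pairs (i, j) of (sum of A over S_i x S_j) * b_ij.

    labels[p] is the index of the cluster containing element p.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Entry (p, q) of the indexed matrix is B[labels[p], labels[q]],
    # the weight of the cluster pair that contains (p, q).
    return float(np.sum(A * B[np.ix_(labels, labels)]))


def brute_force_best(A, B):
    """Try all k^n label assignments; only feasible for very small n."""
    n, k = len(A), len(B)
    best = max(
        itertools.product(range(k), repeat=n),
        key=lambda lab: clustering_objective(A, B, lab),
    )
    return best, clustering_objective(A, B, best)


def conjectured_threshold(k):
    """(8*pi/9) * (1 - 1/k), the value stated in the abstract for k >= 3."""
    return 8 * math.pi / 9 * (1 - 1 / k)


# Illustrative data (hypothetical): four points on a line, centered so that
# the Gram matrix A is positive semi-definite and its entries sum to zero.
x = np.array([0.0, 0.1, 5.0, 5.1])
x_centered = x - x.mean()
A = np.outer(x_centered, x_centered)
B = np.eye(2)  # identity B: only within-cluster entries of A are counted

best_labels, best_value = brute_force_best(A, B)
```

With the identity matrix as $B$, the objective reduces to the sum of the within-cluster entries of $A$, so in this example the exhaustive search groups the two points near 0 together and the two points near 5 together. Setting $k=3$ in `conjectured_threshold` gives $\frac{8\pi}{9}\cdot\frac{2}{3}=\frac{16\pi}{27}$, matching the value stated for the $3\times 3$ identity matrix.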