In the kernel clustering problem we are given a large $n\times n$ positive semi-definite matrix $A=(a_{ij})$ with $\sum_{i,j=1}^n a_{ij}=0$ and a small $k\times k$ positive semi-definite matrix $B=(b_{ij})$. The goal is to find a partition $S_1,\ldots,S_k$ of $\{1,\ldots,n\}$ which maximizes the quantity $$\sum_{i,j=1}^k \left(\sum_{(p,q)\in S_i\times S_j}a_{pq}\right)b_{ij}.$$ We study the computational complexity of this generic clustering problem, which originates in the theory of machine learning. We design a constant factor polynomial time approximation algorithm for this problem, answering a question posed by Song, Smola, Gretton and Borgwardt. In some cases we manage to compute the sharp approximation threshold for this problem assuming the Unique Games Conjecture (UGC). In particular, when $B$ is the $3\times 3$ identity matrix the UGC hardness threshold of this problem is exactly $\frac{16\pi}{27}$. We present and study a geometric conjecture of independent interest which we show would imply that the UGC threshold when $B$ is the $k\times k$ identity matrix is $\frac{8\pi}{9}\left(1-\frac{1}{k}\right)$ for every $k\ge 3$.
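To make the objective concrete, the following is a minimal Python sketch, not the approximation algorithm of the paper. It evaluates the clustering objective for a given partition and finds the optimum of a toy instance by exhaustive search, which is feasible only for very small $n$. A partition is encoded as a label vector, so that the objective equals $\sum_{p,q} a_{pq}\, b_{\sigma(p)\sigma(q)}$ where $\sigma(p)$ is the index of the part containing $p$; empty parts are allowed in this encoding. The function names `clustering_value`, `brute_force_best` and `conjectured_threshold`, as well as the toy matrices, are illustrative assumptions and do not come from the source.

```python
from itertools import product
from math import pi, isclose

def clustering_value(A, B, labels):
    """Objective: sum over all pairs (p, q) of A[p][q] * B[labels[p]][labels[q]]."""
    n = len(A)
    return sum(A[p][q] * B[labels[p]][labels[q]]
               for p in range(n) for q in range(n))

def brute_force_best(A, B):
    """Try all k**n label vectors (exponential time; toy sizes only)."""
    n, k = len(A), len(B)
    best_value, best_labels = float("-inf"), None
    for labels in product(range(k), repeat=n):
        value = clustering_value(A, B, labels)
        if value > best_value:
            best_value, best_labels = value, labels
    return best_value, best_labels

def conjectured_threshold(k):
    """The value (8*pi/9) * (1 - 1/k) stated for B = k x k identity, k >= 3."""
    return (8 * pi / 9) * (1 - 1 / k)

# Toy instance (an assumption for illustration): A is the Gram matrix of the
# points -1, -1, 1, 1 on the line. It is positive semi-definite and its
# entries sum to (sum of points)**2 = 0, as the problem requires.
x = [-1.0, -1.0, 1.0, 1.0]
A = [[xp * xq for xq in x] for xp in x]
B = [[1.0, 0.0], [0.0, 1.0]]  # 2 x 2 identity

best_value, best_labels = brute_force_best(A, B)
print(best_value, best_labels)
print(conjectured_threshold(3), 16 * pi / 27)
```

With $B$ the identity, the objective reduces to the sum over parts of the squared sum of the points in that part, so the search groups the two negative points together and the two positive points together. The last line checks numerically that the general formula at $k=3$ agrees with the stated value $\frac{16\pi}{27}$.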