Approximate kernel clustering

From MaRDI portal




Abstract: In the kernel clustering problem we are given a large nimesn positive semi-definite matrix A=(aij) with sumi,j=1naij=0 and a small kimesk positive semi-definite matrix B=(bij). The goal is to find a partition S1,...,Sk of 1,...n which maximizes the quantity sum_{i,j=1}^k (sum_{(i,j)in S_i imes S_j}a_{ij})b_{ij}. We study the computational complexity of this generic clustering problem which originates in the theory of machine learning. We design a constant factor polynomial time approximation algorithm for this problem, answering a question posed by Song, Smola, Gretton and Borgwardt. In some cases we manage to compute the sharp approximation threshold for this problem assuming the Unique Games Conjecture (UGC). In particular, when B is the 3imes3 identity matrix the UGC hardness threshold of this problem is exactly frac16pi27. We present and study a geometric conjecture of independent interest which we show would imply that the UGC threshold when B is the kimesk identity matrix is frac8pi9(1frac1k) for every kge3.











This page was built for publication: Approximate kernel clustering

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q3400770)