A unified approach to approximation algorithms for bottleneck problems
Journal of the ACM (JACM)
Optimal algorithms for approximate clustering
STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing
A linear-time algorithm for drawing a planar graph on a grid
Information Processing Letters
Approximation algorithms for NP-hard problems
Approximation algorithms for NP-hard problems
Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
Free Bits, PCPs, and Nonapproximability---Towards Tight Results
SIAM Journal on Computing
Finding similar regions in many strings
STOC '99 Proceedings of the thirty-first annual ACM symposium on Theory of computing
An O(log*n) approximation algorithm for the asymmetric p-center problem
Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms
Distinguishing string selection problems
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Efficient approximation algorithms for the Hamming center problem
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
On the complexity of integer programming
Journal of the ACM (JACM)
Discrete Mathematical Structures
Discrete Mathematical Structures
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
We study Hamming versions of two classical clustering problems. The Hamming radius p-clustering problem (HRC) for a set S of k binary strings, each of length n, is to find p binary strings of length n that minimize the maximum Hamming distance between a string in S and the closest of the p strings; this minimum value is termed the p-radius of S and is denoted by ϱ. The related Hamming diameter p-clustering problem (HDC) is to split S into p groups so that the maximum of the Hamming group diameters is minimized; this latter value is called the p-diameter of S. First, we provide an integer programming formulation of HRC which yields exact solutions in polynomial time whenever k and p are constant. We also observe that HDC admits straightforward polynomial-time solutions when k = O(log n) or p = 2. Next, by reduction from the corresponding geometric p-clustering problems in the plane under the L1 metric, we show that neither HRC nor HDC can be approximated within any constant factor smaller than two unless P = NP. We also prove that for any ε > 0 it is NP-hard to split S into at most $p k^{1/7-\varepsilon}$ clusters whose Hamming diameter doesn't exceed the p-diameter. Furthermore, we note that by adapting Gonzalez' farthest-point clustering algorithm [6], HRC and HDC can be approximated within a factor of two in time O(pkn). Next, we describe a $2^{O(p\varrho/\varepsilon)} k^{O(p/\varepsilon)} n^2$-time (1 + ε)-approximation algorithm for HRC. In particular, it runs in polynomial time when p = O(1) and ϱ = O(log(k + n)). Finally, we show how to find in $O((n/\varepsilon + kn \log n + k^2 \log n)(2^{\varrho} k)^{2/\varepsilon})$ time a set L of O(p log k) strings of length n such that for each string in S there is at least one string in L within distance (1 + ε)ϱ, for any constant 0
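To make the factor-two result concrete, the following is a minimal Python sketch of farthest-point clustering applied to binary strings under Hamming distance. It is an illustration under stated assumptions, not the paper's own pseudocode: the function names `hamming` and `farthest_point_centers` are hypothetical, strings are represented as Python `str` values over '0' and '1', and the centers are picked from S itself. Each of the at most p rounds scans all k strings of length n, which matches the O(pkn) bound quoted above.

```python
from typing import List, Sequence, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def farthest_point_centers(strings: Sequence[str], p: int) -> Tuple[List[str], int]:
    """Pick up to p centers from `strings` by the farthest-point heuristic.

    Returns the chosen centers and the resulting radius, i.e. the largest
    distance from any input string to its closest chosen center.
    (Hypothetical helper; choosing strings[0] as the first center is an
    arbitrary convention of this sketch.)
    """
    if not strings or p < 1:
        raise ValueError("need at least one string and p >= 1")

    centers = [strings[0]]
    # dist[i] = distance from strings[i] to its closest center so far.
    dist = [hamming(s, centers[0]) for s in strings]

    while len(centers) < p:
        # The string farthest from all current centers becomes a new center.
        far = max(range(len(strings)), key=dist.__getitem__)
        if dist[far] == 0:
            break  # every string already coincides with some center
        centers.append(strings[far])
        dist = [min(d, hamming(s, strings[far])) for d, s in zip(dist, strings)]

    return centers, max(dist)


if __name__ == "__main__":
    S = ["0000", "0001", "1111", "1110"]
    centers, radius = farthest_point_centers(S, 2)
    print(centers, radius)  # ['0000', '1111'] 1
```

Assigning each string to its closest returned center also yields a partition of S into at most p groups, which is how the same routine serves as a heuristic for HDC. The factor-two guarantee follows from the triangle inequality, so it holds for Hamming distance as for any metric.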