NP-hard problems in hierarchical-tree clustering
Acta Informatica
A cutting plane algorithm for a clustering problem
Mathematical Programming: Series A and B
Polynomial time approximation schemes for dense instances of NP-hard problems
Journal of Computer and System Sciences
Some APX-completeness results for cubic graphs
Theoretical Computer Science
Complexity and Approximation: Combinatorial Optimization Problems and Their Approximability Properties
Clustering with Qualitative Information
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Integrating Microarray Data by Consensus Clustering
ICTAI '03 Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence
Correlation Clustering: maximizing agreements via semidefinite programming
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Machine Learning
Aggregating inconsistent information: ranking and clustering
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
The Correlation Clustering problem was recently introduced [5] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we are given two scores measuring, respectively, the similarity and the dissimilarity of the two points, and we would like to compute an optimal partition, where the value of a partition is obtained by summing the scores of pairs of points in the same cluster and the scores of pairs of points in different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and would like to obtain a partition that best summarizes them. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Min Consensus Clustering is APX-hard even for three input partitions, answering an open question, while Max Consensus Clustering admits a PTAS on instances with a bounded number of input partitions. We exhibit a practical combinatorial $\frac{4}{5}$-approximation algorithm, based on a greedy technique, for Max Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Max Correlation Clustering when the maximum ratio between two scores is at most a constant.
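To make the objective concrete, the following minimal sketch evaluates a candidate partition under one common reading of the definition above, which is an assumption here rather than something the abstract spells out: the Max version sums similarity scores over pairs placed together and dissimilarity scores over pairs placed apart, and the Min version sums the complementary scores. It also illustrates why Consensus Clustering is a restricted case: each pair's similarity is taken to be the number of input partitions that co-cluster it, and its dissimilarity the number that separate it. All function names and the toy data are hypothetical, and this is not the paper's approximation algorithm.

```python
from itertools import combinations


def consensus_scores(points, partitions):
    """Derive pair scores from input partitions (assumed reduction):
    sim = number of partitions putting the pair together,
    dis = number of partitions separating it."""
    sim, dis = {}, {}
    for u, v in combinations(points, 2):
        together = sum(1 for p in partitions if p[u] == p[v])
        sim[(u, v)] = together
        dis[(u, v)] = len(partitions) - together
    return sim, dis


def max_agreement_value(points, sim, dis, cluster_of):
    """Max objective (assumed): sim for co-clustered pairs, dis for separated pairs."""
    return sum(
        sim[(u, v)] if cluster_of[u] == cluster_of[v] else dis[(u, v)]
        for u, v in combinations(points, 2)
    )


def min_disagreement_value(points, sim, dis, cluster_of):
    """Min objective (assumed): dis for co-clustered pairs, sim for separated pairs."""
    return sum(
        dis[(u, v)] if cluster_of[u] == cluster_of[v] else sim[(u, v)]
        for u, v in combinations(points, 2)
    )


# Toy instance with three input partitions, each mapping a point to a cluster label.
points = ["a", "b", "c", "d"]
partitions = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 1, "d": 1},
]
sim, dis = consensus_scores(points, partitions)

# Evaluate the first input partition as a candidate consensus.
candidate = partitions[0]
agreements = max_agreement_value(points, sim, dis, candidate)
disagreements = min_disagreement_value(points, sim, dis, candidate)
print(agreements, disagreements)  # prints "12 6"
```

Under these assumptions the two values always sum to the number of pairs times the number of input partitions (here 6 × 3 = 18), so the same partition is optimal for both versions. The approximation guarantees can still differ sharply, which is consistent with the Min version being APX-hard while the Max version admits a PTAS for a bounded number of input partitions.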