A unified approach to approximation algorithms for bottleneck problems
Journal of the ACM (JACM)
Toward Efficient Agnostic Learning
Machine Learning - Special issue on computational learning theory, COLT'92
MAX-CUT has a randomized approximation scheme in dense graphs
Random Structures & Algorithms
An Õ(n^{3/14})-coloring algorithm for 3-colorable graphs
Information Processing Letters
Property testing and its connection to learning and approximation
Journal of the ACM (JACM)
Efficient noise-tolerant learning from statistical queries
Journal of the ACM (JACM)
Polynomial time approximation schemes for dense instances of NP-hard problems
Journal of Computer and System Sciences
Clustering for edge-cost minimization (extended abstract)
STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Algorithms for graph partitioning on the planted partition model
Random Structures & Algorithms
Computers and Intractability; A Guide to the Theory of NP-Completeness
Testing the diameter of graphs
Random Structures & Algorithms
Improved Algorithms for the Random Cluster Graph Model
SWAT '02 Proceedings of the 8th Scandinavian Workshop on Algorithm Theory
Learning to match and cluster large high-dimensional data sets for data integration
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Improved Combinatorial Algorithms for the Facility Location and k-Median Problems
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science
Spectral Partitioning of Random Graphs
FOCS '01 Proceedings of the 42nd IEEE symposium on Foundations of Computer Science
Clustering with Qualitative Information
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Correlation Clustering: maximizing agreements via semidefinite programming
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Aggregating inconsistent information: ranking and clustering
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
On Non-Approximability for Quadratic Programs
FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
Supervised clustering with support vector machines
ICML '05 Proceedings of the 22nd international conference on Machine learning
Error bounds for correlation clustering
ICML '05 Proceedings of the 22nd international conference on Machine learning
Correlation clustering with a fixed number of clusters
SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithms
Clustering with qualitative information
Journal of Computer and System Sciences - Special issue: Learning theory 2003
On the approximability of maximum and minimum edge clique partition problems
CATS '06 Proceedings of the 12th Computing: The Australasian Theory Symposium - Volume 51
Centralized and Distributed Multi-view Correspondence
International Journal of Computer Vision
Duplicate Record Detection: A Survey
IEEE Transactions on Knowledge and Data Engineering
ACM Transactions on Knowledge Discovery from Data (TKDD)
Efficient location area planning for personal communication systems
IEEE/ACM Transactions on Networking (TON)
Leveraging aggregate constraints for deduplication
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Community Mining from Signed Social Networks
IEEE Transactions on Knowledge and Data Engineering
Seeking stable clusters in the blogosphere
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Approximate clustering of incomplete fingerprints
Journal of Discrete Algorithms
An optimal sdp algorithm for max-cut, and equally optimal long code tests
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
A discriminative framework for clustering via similarity functions
STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
Approximation algorithms for co-clustering
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Spectral clustering with inconsistent advice
Proceedings of the 25th international conference on Machine learning
Aggregating inconsistent information: Ranking and clustering
Journal of the ACM (JACM)
A note on the inapproximability of correlation clustering
Information Processing Letters
A General Framework for Increasing the Robustness of PCA-Based Correlation Clustering Algorithms
SSDBM '08 Proceedings of the 20th international conference on Scientific and Statistical Database Management
Clustering with Partial Information
MFCS '08 Proceedings of the 33rd international symposium on Mathematical Foundations of Computer Science
A Local-Search 2-Approximation for 2-Correlation-Clustering
ESA '08 Proceedings of the 16th annual European symposium on Algorithms
Collaborative partitioning with maximum user satisfaction
Proceedings of the 17th ACM conference on Information and knowledge management
Closest 4-leaf power is fixed-parameter tractable
Discrete Applied Mathematics
ACM Transactions on Knowledge Discovery from Data (TKDD)
A more effective linear kernelization for cluster editing
Theoretical Computer Science
On the approximability of the Maximum Agreement SubTree and Maximum Compatible Tree problems
Discrete Applied Mathematics
Ranking tournaments: Local search and a new algorithm
Journal of Experimental Algorithmics (JEA)
An online blog reading system by topic clustering and personalized ranking
ACM Transactions on Internet Technology (TOIT)
Fixed-Parameter Algorithms for Graph-Modeled Data Clustering
TAMC '09 Proceedings of the 6th Annual Conference on Theory and Applications of Models of Computation
A More Relaxed Model for Graph-Based Data Clustering: s-Plex Editing
AAIM '09 Proceedings of the 5th International Conference on Algorithmic Aspects in Information and Management
Iterative Compression for Exactly Solving NP-Hard Minimization Problems
Algorithmics of Large and Complex Networks
Correlation Clustering Revisited: The "True" Cost of Error Minimization Problems
ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
Deterministic Pivoting Algorithms for Constrained Ranking and Clustering Problems
Mathematics of Operations Research
Graph-Based Data Clustering with Overlaps
COCOON '09 Proceedings of the 15th Annual International Conference on Computing and Combinatorics
An Evidence Accumulation Approach to Constrained Clustering Combination
MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
Get out the vote: determining support or opposition from congressional floor-debate transcripts
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Learning field compatibilities to extract database records from unstructured text
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Bounding and comparing methods for correlation clustering beyond ILP
ILP '09 Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing
Constant ratio fixed-parameter approximation of the edge multicut problem
Information Processing Letters
Correlation clustering for crosslingual link detection
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Practical Markov logic containing first-order quantifiers with application to identity uncertainty
CHSLP '06 Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing
Creating probabilistic databases from duplicated data
The VLDB Journal — The International Journal on Very Large Data Bases
Automobile, car and BMW: horizontal and hierarchical approach in social tagging systems
Proceedings of the 2nd ACM workshop on Social web search and mining
A multiple-perspective approach to constructing and aggregating Citation Semantic Link Network
Future Generation Computer Systems
Framework for evaluating clustering algorithms in duplicate detection
Proceedings of the VLDB Endowment
Editing Graphs into Disjoint Unions of Dense Clusters
ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
Clustering with partial information
Theoretical Computer Science
Structural inference of hierarchies in networks
ICML'06 Proceedings of the 2006 conference on Statistical network analysis
Clustering query refinements by user intent
Proceedings of the 19th international conference on World wide web
Inapproximability of maximum weighted edge biclique and its applications
TAMC'08 Proceedings of the 5th international conference on Theory and applications of models of computation
Improved algorithms for bicluster editing
TAMC'08 Proceedings of the 5th international conference on Theory and applications of models of computation
Probabilistic structured predictors
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
The UGC Hardness Threshold of the $L_p$ Grothendieck Problem
Mathematics of Operations Research
Untangling the cross-lingual link structure of Wikipedia
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Improved consensus clustering via linear programming
ACSC '10 Proceedings of the Thirty-Third Australasian Conference on Computer Science - Volume 102
Correlation clustering with noisy input
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
A polynomial time approximation scheme for k-consensus clustering
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming
Record linkage with uniqueness constraints and erroneous values
Proceedings of the VLDB Endowment
Towards a more discriminative and semantic visual vocabulary
Computer Vision and Image Understanding
Alternative parameterizations for cluster editing
SOFSEM'11 Proceedings of the 37th international conference on Current trends in theory and practice of computer science
Computational Linguistics
Automatic discovery of attributes in relational databases
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Fixed-parameter tractability of multicut parameterized by the size of the cutset
Proceedings of the forty-third annual ACM symposium on Theory of computing
PLINI: a probabilistic logic program framework for inconsistent news information
Logic programming, knowledge representation, and nonmonotonic reasoning
IbPRIA'11 Proceedings of the 5th Iberian conference on Pattern recognition and image analysis
Clustering with local restrictions
ICALP'11 Proceedings of the 38th international colloquium conference on Automata, languages and programming - Volume Part I
Can everybody sit closer to their friends than their enemies?
MFCS'11 Proceedings of the 36th international conference on Mathematical foundations of computer science
The minimum transfer cost principle for model-order selection
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Improved approximation algorithms for bipartite correlation clustering
ESA'11 Proceedings of the 19th European conference on Algorithms
A 2k kernel for the cluster editing problem
Journal of Computer and System Sciences
Journal of Biomedical Informatics
A More Relaxed Model for Graph-Based Data Clustering: $s$-Plex Cluster Editing
SIAM Journal on Discrete Mathematics
Fitting Tree Metrics: Hierarchical Clustering and Phylogeny
SIAM Journal on Computing
Convergence and approximation in potential games
STACS'06 Proceedings of the 23rd Annual conference on Theoretical Aspects of Computer Science
Exploiting Web querying for Web people search
ACM Transactions on Database Systems (TODS)
Quality-aware similarity assessment for entity matching in Web data
Information Systems
On the NP-Completeness of some graph cluster measures
SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
Overcoming browser cookie churn with clustering
Proceedings of the fifth ACM international conference on Web search and data mining
Correlation clustering and consensus clustering
ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
Fixed-parameter tractable generalizations of cluster editing
CIAC'06 Proceedings of the 6th Italian conference on Algorithms and Complexity
Information distances over clusters
ISNN'10 Proceedings of the 7th international conference on Advances in Neural Networks - Volume Part I
Pruning training samples using a supervised clustering algorithm
ISNN'10 Proceedings of the 7th international conference on Advances in Neural Networks - Volume Part II
A randomized PTAS for the minimum Consensus Clustering with a fixed number of clusters
Theoretical Computer Science
On the parameterized complexity of consensus clustering
ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
Convergence and approximation in potential games
Theoretical Computer Science
Optimal partitions in additively separable hedonic games
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume One
Computer Science Review
Graph-based data clustering with overlaps
Discrete Optimization
Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Visualization of Global Correlation Structures in Uncertain 2D Scalar Fields
Computer Graphics Forum
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Discriminative clustering for market segmentation
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Chromatic correlation clustering
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
On editing graphs into 2-club clusters
FAW-AAIM'12 Proceedings of the 6th international Frontiers in Algorithmics, and Proceedings of the 8th international conference on Algorithmic Aspects in Information and Management
Dynamic reconfiguration in modular robots using graph partitioning-based coalitions
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Cluster editing with locally bounded modifications
Discrete Applied Mathematics
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Scalable clustering of signed networks using balance normalized cut
Proceedings of the 21st ACM international conference on Information and knowledge management
Routing state distance: a path-based metric for network analysis
Proceedings of the 2012 ACM conference on Internet measurement conference
A more effective linear kernelization for Cluster Editing
ESCAPE'07 Proceedings of the First international conference on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies
On the complexity of Newman's community finding approach for biological and social networks
Journal of Computer and System Sciences
Globally optimal closed-surface segmentation for connectomics
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part III
Fast planar correlation clustering for image segmentation
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI
Clustering with local restrictions
Information and Computation
A machine learning approach for instance matching based on similarity metrics
ISWC'12 Proceedings of the 11th international conference on The Semantic Web - Volume Part I
Computing desirable partitions in additively separable hedonic games
Artificial Intelligence
Coalition structure generation over graphs
Journal of Artificial Intelligence Research
Clustering under approximation stability
Journal of the ACM (JACM)
Leveraging transitive relations for crowdsourced joins
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
A power-driven thermal sensor placement algorithm for dynamic thermal management
Proceedings of the Conference on Design, Automation and Test in Europe
Finding small separators in linear time via treewidth reduction
ACM Transactions on Algorithms (TALG)
Correlation clustering with stochastic labellings
SIMBAD'13 Proceedings of the Second international conference on Similarity-Based Pattern Recognition
Break and conquer: efficient correlation clustering for image segmentation
SIMBAD'13 Proceedings of the Second international conference on Similarity-Based Pattern Recognition
Question selection for crowd entity resolution
Proceedings of the VLDB Endowment
We consider the following clustering problem: we have a complete graph on n vertices (items), where each edge (u, v) is labeled either + or − depending on whether u and v have been deemed to be similar or different. The goal is to produce a partition of the vertices (a clustering) that agrees as much as possible with the edge labels. That is, we want a clustering that maximizes the number of + edges within clusters, plus the number of − edges between clusters (equivalently, minimizes the number of disagreements: the number of − edges inside clusters plus the number of + edges between clusters). This formulation is motivated by a document clustering problem in which one has a pairwise similarity function f learned from past data, and the goal is to partition the current set of documents in a way that correlates with f as much as possible; it can also be viewed as a kind of “agnostic learning” problem.

An interesting feature of this clustering formulation is that one does not need to specify the number of clusters k as a separate parameter, as in measures such as k-median or min-sum or min-max clustering. Instead, in our formulation, the optimal number of clusters could be any value between 1 and n, depending on the edge labels. We look at approximation algorithms for both minimizing disagreements and for maximizing agreements. For minimizing disagreements, we give a constant factor approximation. For maximizing agreements we give a PTAS, building on ideas of Goldreich, Goldwasser, and Ron (1998) and de la Vega (1996). We also show how to extend some of these results to graphs with edge labels in [−1, +1], and give some results for the case of random noise.
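To make the objective concrete, the following is a minimal Python sketch of how agreements and disagreements could be counted for a given clustering, with a brute-force search over all partitions of a tiny instance. It is an illustration only, not the approximation algorithms described in the abstract. The function names (`count_disagreements`, `count_agreements`, `all_partitions`), the dictionary-based edge representation, and the four-vertex example are all assumptions made for this sketch.

```python
from itertools import combinations


def count_disagreements(labels, cluster_of):
    """Count '-' edges inside clusters plus '+' edges between clusters.

    labels: dict mapping frozenset({u, v}) -> '+' or '-' for every vertex pair.
    cluster_of: dict mapping each vertex to a cluster id.
    """
    disagreements = 0
    for u, v in combinations(sorted(cluster_of), 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        positive = labels[frozenset((u, v))] == "+"
        if same_cluster != positive:
            disagreements += 1
    return disagreements


def count_agreements(labels, cluster_of):
    """On a complete graph, agreements = all pairs minus disagreements."""
    n = len(cluster_of)
    return n * (n - 1) // 2 - count_disagreements(labels, cluster_of)


def all_partitions(items):
    """Yield every partition of items as a dict vertex -> cluster id.

    Exponential in len(items); only meant for very small examples.
    """
    items = list(items)

    def extend(i, assignment, used):
        if i == len(items):
            yield dict(zip(items, assignment))
            return
        # Reuse any existing cluster id, or open exactly one new cluster.
        for c in range(used + 1):
            yield from extend(i + 1, assignment + [c], max(used, c + 1))

    yield from extend(0, [], 0)


# Hypothetical instance: '+' on (a, b) and (b, c), '-' on every other pair.
vertices = ["a", "b", "c", "d"]
positive_pairs = {("a", "b"), ("b", "c")}
labels = {
    frozenset(pair): "+" if pair in positive_pairs else "-"
    for pair in combinations(vertices, 2)
}

# The number of clusters is not fixed in advance; the search picks it.
best = min(all_partitions(vertices), key=lambda p: count_disagreements(labels, p))
```

In this example the pairs (a, b) and (b, c) are labeled + while (a, c) is labeled −, so no clustering can agree with all three labels at once, and the brute-force search has to trade one disagreement for another. The search is only feasible for a handful of vertices, since the number of partitions grows very quickly with n.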