Pairwise data clustering and applications

Authors:
Xiaodong Wu;Danny Z. Chen;James J. Mason;Steven R. Schmid
Affiliations:
Department of Computer Science, University of Texas-Pan American, Edinburg, TX;Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN;Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN;Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, IN
Venue:
COCOON'03 Proceedings of the 9th annual international conference on Computing and combinatorics
Year:
2003

Abstract

Data clustering is an important theoretical topic and a powerful tool for various applications. Its main objective is to partition a given data set into clusters such that the data within the same cluster are "more" similar to each other with respect to certain measures. In this paper, we study the pairwise data clustering problem with pairwise similarity/dissimilarity measures that need not satisfy the triangle inequality. Using a criterion called the minimum normalized cut, we model the pairwise data clustering problem as a graph partition problem. The graph partition problem based on minimizing the normalized cut is known to be NP-hard. We present a polynomial-time ((4 + o(1)) ln n)-approximation algorithm for the minimum normalized cut problem. We also give a more efficient algorithm for this problem by sacrificing the approximation ratio slightly. Further, our scheme yields a polynomial-time ((2 + o(1)) ln n)-approximation algorithm for computing the sparsest cuts in edge-weighted and vertex-weighted undirected graphs, improving the previously best known approximation ratio by a constant factor.
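To make the normalized cut criterion concrete, the following is a minimal Python sketch, not the approximation algorithm of the paper. It assumes the normalized cut definition commonly attributed to Shi and Malik, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V), which may differ in detail from the definition used in the paper. The function names, the exhaustive search, and the example similarity matrix are all illustrative assumptions. The search enumerates every bipartition, so it takes exponential time and is only usable on very small inputs.

```python
from itertools import combinations


def normalized_cut(weights, part_a):
    """Normalized cut of the bipartition (part_a, rest) of a weighted graph.

    weights is a symmetric matrix of nonnegative pairwise similarities
    (assumed definition: cut/assoc(A, V) + cut/assoc(B, V)).
    """
    n = len(weights)
    a = set(part_a)
    b = set(range(n)) - a
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total weight from each side to all vertices of the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")  # degenerate side: treat as an invalid partition
    return cut / assoc_a + cut / assoc_b


def brute_force_min_ncut(weights):
    """Exhaustively search all bipartitions (exponential time; tiny graphs only)."""
    n = len(weights)
    best_value, best_part = float("inf"), None
    # Keep vertex 0 on side A so each bipartition is enumerated exactly once.
    for size in range(0, n - 1):
        for rest in combinations(range(1, n), size):
            part = (0,) + rest
            value = normalized_cut(weights, part)
            if value < best_value:
                best_value, best_part = value, part
    return best_value, best_part


# Hypothetical example: two tight triangles {0, 1, 2} and {3, 4, 5}
# joined by a single weak edge of weight 0.1 between vertices 2 and 3.
weights = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
best_value, best_part = brute_force_min_ncut(weights)
print(best_part, best_value)
```

On this example the search separates the two triangles, cutting only the weak edge. Note that the similarity matrix is not required to satisfy the triangle inequality, which matches the pairwise setting described in the abstract.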