Exceeding expectations and clustering uncertain data

Authors:
Sudipto Guha;Kamesh Munagala
Affiliations:
University of Pennsylvania, Philadelphia, PA, USA;Duke University, Durham, NC, USA
Venue:
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2009

Citing 23
Cited 1

Optimal algorithms for approximate clustering
STOC '88 Proceedings of the twentieth annual ACM symposium on Theory of computing

A constant-factor approximation algorithm for the k-median problem (extended abstract)
STOC '99 Proceedings of the thirty-first annual ACM symposium on Theory of computing

Approximation in stochastic scheduling: the power of LP-based priority policies
Journal of the ACM (JACM)

Scheduling precedence-constrained jobs with stochastic processing times on parallel machines
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms

Algorithms for facility location problems with outliers
SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms

Approximation algorithms for metric facility location and k-Median problems using the primal-dual schema and Lagrangian relaxation
Journal of the ACM (JACM)

Allocating Bandwidth for Bursty Connections
SIAM Journal on Computing

Stochastic Load Balancing and Related Problems
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science

Improved Combinatorial Algorithms for the Facility Location and k-Median Problems
FOCS '99 Proceedings of the 40th Annual Symposium on Foundations of Computer Science

Local Search Heuristics for k-Median and Facility Location Problems
SIAM Journal on Computing

On the costs and benefits of procrastination: approximation algorithms for stochastic combinatorial optimization problems
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms

Boosted sampling: approximation algorithms for stochastic optimization
STOC '04 Proceedings of the thirty-sixth annual ACM symposium on Theory of computing

Approximating the Stochastic Knapsack Problem: The Benefit of Adaptivity
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science

An Edge in Time Saves Nine: LP Rounding Approximation Algorithms for Stochastic Network Design
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science

Stochastic Optimization is (Almost) as easy as Deterministic Optimization
FOCS '04 Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science

Adaptivity and approximation for stochastic packing problems
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms

Sampling-based Approximation Algorithms for Multi-stage Stochastic
FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science

Model-driven optimization using adaptive probes
SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms

A plant location guide for the unsure
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms

Approximation algorithms for clustering uncertain data
Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems

How to probe for an extreme value
ACM Transactions on Algorithms (TALG)

Stochastic steiner trees without a root
ICALP'05 Proceedings of the 32nd international conference on Automata, Languages and Programming

What about wednesday? approximation algorithms for multistage stochastic optimization
APPROX'05/RANDOM'05 Proceedings of the 8th international workshop on Approximation, Randomization and Combinatorial Optimization Problems, and Proceedings of the 9th international conference on Randomization and Computation: algorithms and techniques

Large-scale uncertainty management systems: learning and exploiting your data
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data

Abstract

Database technology plays an increasingly important role in understanding and solving large-scale, complex scientific and societal problems, such as understanding biological networks, climate modeling, and electronic markets. In these settings, uncertain or imprecise information is pervasive and becomes a serious impediment to understanding and effectively using such systems. Clustering is one of the key problems in this context. In this paper we focus on clustering, specifically the k-center problem. Since the problem is NP-hard even in the deterministic setting, a natural avenue is to consider approximation algorithms with a bounded performance ratio. In an earlier paper, Cormode and McGregor considered certain variants of this problem, but did not provide approximations that preserve the number of centers. In this paper we remedy the situation and provide true approximation algorithms for a wider class of these problems. The key aspect of this paper, however, is to devise general techniques for optimization under uncertainty. We show that a particular formulation, which uses the contribution of a random variable above its expectation, is useful in this context. We believe these techniques will find wider applications in optimization under uncertainty.
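To make the phrase "contribution of a random variable above its expectation" concrete, the sketch below computes E[max(X - E[X], 0)] for a discrete random variable X. This is only one plausible reading of the quantity the abstract names; the abstract does not give the paper's exact formulation, and the function name `excess_over_expectation` and its list-based interface are assumptions made for illustration.

```python
from typing import Sequence


def excess_over_expectation(values: Sequence[float], probs: Sequence[float]) -> float:
    """Return E[max(X - E[X], 0)] for a discrete random variable X.

    Hypothetical helper: X takes values[i] with probability probs[i].
    This illustrates the idea only and is not the paper's definition.
    """
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")

    # Expectation of X.
    mean = sum(v * p for v, p in zip(values, probs))

    # Only outcomes above the mean contribute; the rest are clipped to zero.
    return sum(max(v - mean, 0.0) * p for v, p in zip(values, probs))


if __name__ == "__main__":
    # An uncertain distance that is 0 or 10 with equal probability.
    print(excess_over_expectation([0.0, 10.0], [0.5, 0.5]))
```

A deterministic value has zero excess, so under this reading the quantity isolates the part of a cost that is due to uncertainty.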