Large-scale uncertainty management systems: learning and exploiting your data

Authors:
Shivnath Babu;Sudipto Guha;Kamesh Munagala
Affiliations:
Duke University, Durham, NC, USA;University of Pennsylvania, Philadelphia, PA, USA;Duke University, Durham, NC, USA
Venue:
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Year:
2009

Abstract

The database community has made rapid strides in capturing, representing, and querying uncertain data. Probabilistic databases capture the inherent uncertainty in derived tuples as probability estimates. Data acquisition and stream systems can produce succinct summaries of very large and time-varying datasets. This tutorial addresses the natural next step in harnessing uncertain data: how can we efficiently and quantifiably determine what, how, and how much to learn in order to make good decisions based on the imprecise information available? The material in this tutorial is drawn from a range of fields, including database systems, control and information theory, operations research, convex optimization, and statistical learning. The tutorial focuses on the natural constraints imposed in a database context and on the demands that imprecise information places on optimization. We look to both the past and the future: we discuss general tools and techniques that can serve as a guide to database researchers and practitioners, and we enumerate the challenges that lie ahead.