Several real-world applications need to effectively manage and reason about large amounts of data that are inherently uncertain. For instance, pervasive computing applications must constantly reason about volumes of noisy sensory readings for purposes such as motion prediction and human behavior modeling. Such probabilistic data analyses require sophisticated machine-learning tools that can effectively model the complex spatio-temporal correlation patterns present in uncertain sensory data. Unfortunately, to date, most existing approaches to probabilistic database systems have relied on somewhat simplistic models of uncertainty that can be easily mapped onto existing relational architectures: probabilistic information is typically associated with individual data tuples, with limited or no support for capturing and reasoning about complex data correlations. In this paper, we introduce BayesStore, a novel probabilistic data management architecture built on the principle of handling statistical models and probabilistic inference tools as first-class citizens of the database system. Adopting a machine-learning view, BayesStore employs concise statistical relational models to effectively encode the correlation patterns between uncertain data, and promotes probabilistic inference and statistical model manipulation as part of the standard DBMS operator repertoire to support efficient and sound query processing. We present BayesStore's uncertainty model, which is based on a novel first-order statistical model, and we redefine traditional query processing operators to manipulate the data and the probabilistic models of the database in an efficient manner. Finally, we validate our approach by demonstrating the value of exploiting data correlations during query processing and by evaluating a number of optimizations that significantly accelerate query processing.
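The gap between tuple-level probabilities and correlated uncertainty can be illustrated with a small sketch. The Python snippet below is not BayesStore's model or API; it uses a hypothetical joint distribution over two neighbouring sensors (the numbers and the names `joint`, `marginal`, and `independent_estimate` are assumptions made for illustration) and compares the true probability that both sensors read "hot" with the value obtained when each reading is treated as an independent tuple.

```python
# Hypothetical joint distribution over the readings of two neighbouring
# sensors. The numbers are illustrative only and are not from the paper.
joint = {
    ("hot", "hot"): 0.45,
    ("hot", "cold"): 0.05,
    ("cold", "hot"): 0.05,
    ("cold", "cold"): 0.45,
}


def marginal(joint, index):
    """Marginal distribution of the sensor at position `index`."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], 0.0) + p
    return out


def independent_estimate(joint, outcome):
    """Probability of `outcome` if every sensor were independent,
    i.e. the product of the per-sensor marginals."""
    p = 1.0
    for i, value in enumerate(outcome):
        p *= marginal(joint, i)[value]
    return p


# Probability that both sensors read "hot", with and without correlations.
both_hot_true = joint[("hot", "hot")]
both_hot_indep = independent_estimate(joint, ("hot", "hot"))

print(f"with correlations:       {both_hot_true:.2f}")
print(f"independence assumption: {both_hot_indep:.2f}")
```

In this toy setting each sensor reads "hot" with probability 0.5, so the independence assumption yields 0.25 for the conjunction, whereas the correlated model assigns it 0.45. A query that joins or filters on both readings would therefore return noticeably different answers under the two models.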