Capturing continuous data and answering aggregate queries in probabilistic XML

Authors:
Serge Abiteboul;T.-H. Hubert Chan;Evgeny Kharlamov;Werner Nutt;Pierre Senellart
Affiliations:
INRIA Saclay -- Île-de-France & LSV, ENS Cachan, Orsay Cedex, France;The University of Hong Kong, Pokfulam Road, Hong Kong;Free University of Bozen-Bolzano & INRIA Saclay -- Île-de-France, Bolzano, Italy;Free University of Bozen-Bolzano, Bolzano, Italy;Institut Télécom / Télécom ParisTech / CNRS LTCI, Paris, France
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2011


Abstract

Sources of data uncertainty and imprecision are numerous. One way to handle this uncertainty is to associate probabilistic annotations with data. Many such probabilistic database models have been proposed, both in the relational and in the semi-structured setting. The latter is particularly well adapted to the management of uncertain data coming from a variety of automatic processes. An important problem, in the context of probabilistic XML databases, is that of answering aggregate queries (count, sum, avg, etc.), which has received limited attention so far. In a model unifying the various (discrete) semi-structured probabilistic models studied up to now, we present algorithms to compute the distribution of the aggregation values (exploiting some regularity properties of the aggregate functions) and probabilistic moments (especially expectation and variance) of this distribution. We also prove the intractability of some of these problems and investigate approximation techniques. We finally extend the discrete model to a continuous one, in order to take into account continuous data values, such as measurements from sensor networks, and extend our algorithms and complexity results to the continuous case.
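
To make the notions of "distribution of the aggregation values" and "probabilistic moments" concrete, the following is a minimal sketch under a deliberately simplified assumption: each numeric leaf is present independently with a given probability. This is not the unified model of the paper, and the function names (sum_distribution, moments, expected_sum) are hypothetical, chosen for illustration only. It computes the full distribution of sum by enumerating presence/absence one leaf at a time, then derives expectation and variance from it, and contrasts this with the expectation obtained directly by linearity.

```python
from collections import defaultdict
from fractions import Fraction


def sum_distribution(leaves):
    """Distribution of the sum over independently present leaves.

    leaves: list of (value, probability) pairs; each leaf is assumed to be
    present independently of the others (a simplifying assumption).
    Returns a dict mapping each possible sum to its exact probability.
    """
    dist = {0: Fraction(1)}
    for value, p in leaves:
        nxt = defaultdict(Fraction)
        for s, q in dist.items():
            nxt[s] += q * (1 - p)        # leaf absent: sum unchanged
            nxt[s + value] += q * p      # leaf present: sum grows by value
        dist = dict(nxt)
    return dist


def moments(dist):
    """Expectation and variance of a finite distribution {outcome: prob}."""
    mean = sum(s * q for s, q in dist.items())
    var = sum((s - mean) ** 2 * q for s, q in dist.items())
    return mean, var


def expected_sum(leaves):
    """Expectation of the sum by linearity, without building the distribution."""
    return sum(v * p for v, p in leaves)


if __name__ == "__main__":
    leaves = [(10, Fraction(1, 2)), (20, Fraction(1, 4))]
    dist = sum_distribution(leaves)
    print(dist)
    print(moments(dist))
    print(expected_sum(leaves))
```

In this toy setting the distribution can have exponentially many outcomes in the number of leaves, whereas the expectation follows from a single linear pass. That gap is only an informal illustration of why moments and full distributions may be treated as separate problems.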