We study the complexity of query answering using views in a probabilistic XML setting, identifying large classes of XPath queries (with child and descendant navigation and predicates) for which there are efficient (PTime) algorithms. We consider this problem under the two possible semantics for XML query results: with persistent node identifiers and in their absence. Accordingly, we consider rewritings that exploit a single view, by means of compensation, and rewritings that use multiple views, by means of intersection. Since queries in a probabilistic setting return answers with probabilities, the rewriting problem goes beyond the classic one of retrieving XML answers from views. For both semantics of XML queries, we show that, even when XML answers can be retrieved from views, their probabilities may not be computable. For rewritings that use only compensation, we describe a PTime decision procedure, based on easily verifiable criteria that distinguish the feasible cases, in which probabilistic XML results are computable, from the infeasible ones. For rewritings that can use multiple views, with compensation and intersection, we identify the most permissive conditions that make probabilistic rewriting feasible, and we describe an algorithm that is sound in general and becomes complete under fairly permissive restrictions, running in PTime modulo equivalence tests that take exponential time in the worst case. This is the best we can hope for, since intersection makes query equivalence intractable already over deterministic data. Our algorithm runs in PTime whenever deterministic rewritings can be found in PTime.
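To make "queries return answers with probabilities" concrete, here is a minimal Python sketch. It assumes a simplified model in which every node exists independently with a given probability whenever its parent exists; this is an illustrative assumption, not necessarily the probabilistic XML model used in the paper. The names `PNode` and `answers` are hypothetical, and the query is restricted to a child-only path of labels, with no descendant steps, predicates, views, or rewriting.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class PNode:
    """Hypothetical probabilistic tree node (assumed independent-edge model)."""
    label: str
    prob: float = 1.0  # probability the node exists, given that its parent exists
    children: list[PNode] = field(default_factory=list)


def answers(root: PNode, path: list[str]) -> list[tuple[PNode, float]]:
    """Match a child-only label path from the root.

    Returns each matched node with the probability that it appears in a
    random world, i.e. the product of the edge probabilities on its root path.
    """
    if not path or root.label != path[0]:
        return []
    frontier = [(root, root.prob)]
    for label in path[1:]:
        frontier = [
            (child, p * child.prob)
            for node, p in frontier
            for child in node.children
            if child.label == label
        ]
    return frontier


# Toy document: a root "a" with two uncertain "b" children, each with a "c" child.
doc = PNode("a", 1.0, [
    PNode("b", 0.5, [PNode("c", 0.8)]),
    PNode("b", 0.9, [PNode("c", 0.5)]),
])

# Answers to the path a/b/c, each paired with its probability.
result = [(n.label, round(p, 3)) for n, p in answers(doc, ["a", "b", "c"])]
```

Here each answer carries the product of the probabilities along its root path, which is why a rewriting over views has to recover both the answer nodes and these probability values.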