Learning twig and path queries

Authors:
Sławek Staworko;Piotr Wieczorek
Affiliations:
University of Lille, France;University of Wrocław
Venue:
Proceedings of the 15th International Conference on Database Theory
Year:
2012

Abstract

We investigate the problem of learning XML queries, specifically path queries and twig queries, from examples given by the user. A learning algorithm takes as input a set of XML documents with nodes annotated by the user and returns a query that selects nodes in a manner consistent with the annotation. We study two learning settings that differ in the types of annotations. In the first setting, the user may only indicate required nodes that the query must select (i.e., positive examples). In the second, more general setting, the user may also indicate forbidden nodes that the query must not select (i.e., negative examples). A node with no annotation may or may not be selected by the query. We formalize what it means for a class of queries to be learnable. One requirement is the existence of a learning algorithm that is sound, i.e., always returns a query consistent with the examples given by the user. Furthermore, the learning algorithm should be complete, i.e., able to produce every query given sufficiently rich examples. Other requirements involve tractability of the learning algorithm and its robustness to nonessential examples. We identify practical classes of Boolean and unary, path and twig queries that are learnable from positive examples. We also show that adding negative examples to the picture renders learning infeasible.
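
The notion of a query being consistent with the user's annotation can be illustrated with a minimal sketch. It is not the learning algorithm of the paper; it only checks consistency, and it assumes path queries written in the limited XPath subset supported by Python's xml.etree.ElementTree. The function name is_consistent and the sample document are hypothetical, introduced here for illustration only.

```python
import xml.etree.ElementTree as ET


def is_consistent(query, doc_root, positives, negatives=()):
    """Return True if `query` selects every positive node and no negative node.

    Nodes without annotation are ignored: the query may or may not select them.
    (Hypothetical helper, not taken from the paper.)
    """
    # Compare nodes by identity, since distinct nodes may have equal content.
    selected = {id(node) for node in doc_root.findall(query)}
    selects_all_required = all(id(n) in selected for n in positives)
    selects_no_forbidden = not any(id(n) in selected for n in negatives)
    return selects_all_required and selects_no_forbidden


# Hypothetical sample document.
doc = ET.fromstring(
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "<journal><title>C</title></journal>"
    "</library>"
)

# User annotations: book titles are required, the journal title is forbidden.
book_titles = doc.findall("./book/title")
journal_title = doc.find("./journal/title")

# First setting (positive examples only): a very general query is consistent.
positive_only = is_consistent(".//title", doc, book_titles)

# Second setting (with negative examples): the general query is rejected,
# while a more specific path query remains consistent.
with_negative = is_consistent(".//title", doc, book_titles, [journal_title])
refined = is_consistent("./book/title", doc, book_titles, [journal_title])
```

In this sketch, a sound learner in the sense of the abstract would be one that only ever returns queries for which such a check succeeds on the given examples.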