Introduction to algorithms
From structured documents to novel query facilities
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Proximal nodes: a model to query document databases by content and structure
ACM Transactions on Information Systems (TOIS)
XMill: an efficient compressor for XML data
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient string matching: an aid to bibliographic search
Communications of the ACM
Monitoring XML data on the Web
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Mesh-based content routing using XML
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Models and issues in data stream systems
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Handbook of Formal Languages
Introduction To Automata Theory, Languages, And Computation
Introduction To Automata Theory, Languages, And Computation
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Mind Your Grammar: a New Approach to Modelling Text
VLDB '87 Proceedings of the 13th International Conference on Very Large Data Bases
Efficient Filtering of XML Documents for Selective Dissemination of Information
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Efficient Filtering of XML Documents with XPath Expressions
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Hi-index | 0.00 |
XML data is queried with XPath expressions, which are a limited form of regular expressions. New XML stream processing applications, such as content-based routing or selective dissemination of information, require thousands or millions of XPath expressions to be evaluated simultaneously on the incoming XML stream at a high, sustained rate.Con ceptually, the XPath evaluation problem is analogous to the text search problem, in which one or several regular expressions need to be matched to a given text, but the number of regular expressions here is much larger, while the "text" is much shorter, since it corresponds to the depth of the XML stream. In this paper we examine techniques that have been proposed for XML stream processing, which are variations of either a non-deterministic or a deterministic finite automata (NFA and DFA). For the latter, we describe a series or theoretical results establishing lower and upper bounds on the number of DFA states for sets of XPath expressions.