We consider the problem of evaluating a large number of XPath expressions on a stream of XML packets. We contribute two novel techniques. The first is to use a single Deterministic Finite Automaton (DFA) for all the expressions. The contribution here is to show that the DFA can be used effectively for this problem: in our experiments we achieve a constant throughput, independent of the number of XPath expressions. The major issue is the size of the DFA, which, in theory, can be exponential in the number of XPath expressions. We address it with a lazy DFA, whose states are constructed on demand at run time, and we provide a series of theoretical results and experimental evaluations showing that the lazy DFA has a small number of states for all practical purposes. These results are of general interest in XPath processing, beyond stream processing. The second technique is the Streaming IndeX (SIX), which adds a small amount of binary data to each XML packet and allows the query processor to achieve significant speedups. As an application of these techniques we describe the XML Toolkit (XMLTK), a collection of command-line tools providing highly scalable XML data processing.
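The lazy DFA idea can be illustrated with a minimal Python sketch. It is not the XMLTK implementation: the names `parse_xpath`, `LazyDFA`, and `filter_stream` are hypothetical, and the sketch assumes linear XPath expressions built only from `/`, `//`, and `*` (no predicates). Each DFA state is a set of (query, position) pairs, and a transition is computed and cached only when the stream first requires it, so the table holds only states that the data actually reaches.

```python
import io
import re
import xml.etree.ElementTree as ET


def parse_xpath(expr):
    """Split a linear XPath such as '/a//b/*' into (axis, label) steps."""
    return re.findall(r"(//|/)([^/]+)", expr)


class LazyDFA:
    """One automaton for all queries; transitions are built on demand."""

    def __init__(self, queries):
        self.queries = list(queries)
        self.steps = [parse_xpath(q) for q in self.queries]
        # A DFA state is a frozenset of NFA states (query index, steps matched).
        self.start = frozenset((q, 0) for q in range(len(self.queries)))
        self.table = {}  # (dfa_state, tag) -> dfa_state, filled lazily

    def step(self, state, tag):
        key = (state, tag)
        nxt = self.table.get(key)
        if nxt is None:  # first time this transition is needed: compute it
            targets = set()
            for q, i in state:
                if i == len(self.steps[q]):
                    continue  # query already fully matched at an ancestor
                axis, label = self.steps[q][i]
                if label in (tag, "*"):
                    targets.add((q, i + 1))  # this step matches the element
                if axis == "//":
                    targets.add((q, i))  # descendant axis: keep waiting below
            nxt = self.table[key] = frozenset(targets)
        return nxt

    def accepted(self, state):
        """Queries whose last step matched the current element."""
        return sorted(
            self.queries[q] for q, i in state if i == len(self.steps[q])
        )


def filter_stream(xml_bytes, dfa):
    """Run the automaton over parser events; return (query, path) matches."""
    stack = [dfa.start]  # one DFA state per open element
    path = []
    matches = []
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            state = dfa.step(stack[-1], elem.tag)
            stack.append(state)
            path.append(elem.tag)
            for query in dfa.accepted(state):
                matches.append((query, "/" + "/".join(path)))
        else:
            stack.pop()
            path.pop()
    return matches
```

With this design the per-element cost is one dictionary lookup once a transition has been cached, regardless of how many queries are registered, which is the intuition behind the constant throughput reported above. The stack of DFA states mirrors the open elements, so closing an element restores the state of its parent.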