Validating XML documents in the streaming model with external memory
ACM Transactions on Database Systems (TODS) - Invited papers issue
We study the problem of validating XML documents of size N against general DTDs in the context of streaming algorithms. The starting point of this work is a well-known space lower bound: there are XML documents and DTDs for which p-pass streaming algorithms require Ω(N/p) space. We show that, when access to external memory is allowed, there is a deterministic streaming algorithm that solves this problem with O(log² N) memory space, a constant number of auxiliary read/write streams, and a total of O(log N) passes over the XML document and the auxiliary streams. An important intermediate step of this algorithm is the computation of the First-Child-Next-Sibling (FCNS) encoding of the initial XML document in a streaming fashion. We study this problem independently, and we also provide memory-efficient streaming algorithms for decoding an XML document given in its FCNS encoding. Furthermore, validating XML documents that encode binary trees in the usual streaming model, without external memory, can be done with sublinear memory: there is a one-pass algorithm using O(√N log N) space and a bidirectional two-pass algorithm using O(log² N) space for this task.
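To make the First-Child-Next-Sibling encoding concrete, the following is a minimal in-memory sketch in Python. It is not the streaming algorithm of the paper: it assumes the whole tree fits in memory and only illustrates the encoding itself, in which the left pointer of a binary node leads to the first child of the original node and the right pointer leads to its next sibling. The names Node, BinNode, fcns_encode and fcns_decode are hypothetical and chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Unranked tree node, e.g. one XML element (hypothetical helper type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class BinNode:
    """Binary tree node of the FCNS encoding (hypothetical helper type)."""
    label: str
    left: Optional["BinNode"] = None   # first child in the original tree
    right: Optional["BinNode"] = None  # next sibling in the original tree


def fcns_encode(siblings: List[Node]) -> Optional[BinNode]:
    """Encode a list of sibling subtrees as a single binary tree."""
    head: Optional[BinNode] = None
    # Walk right to left so each node can point at its next sibling.
    for node in reversed(siblings):
        head = BinNode(node.label, left=fcns_encode(node.children), right=head)
    return head


def fcns_decode(b: Optional[BinNode]) -> List[Node]:
    """Invert fcns_encode: follow right pointers to enumerate siblings."""
    out: List[Node] = []
    while b is not None:
        out.append(Node(b.label, fcns_decode(b.left)))
        b = b.right
    return out


# Document <r><a><c/></a><b/></r> as an unranked tree.
doc = Node("r", [Node("a", [Node("c")]), Node("b")])
enc = fcns_encode([doc])
```

In this example the root r has no sibling, so its right pointer is empty; its left pointer leads to a, whose left pointer leads to its child c and whose right pointer leads to its sibling b. Decoding the result with fcns_decode gives back the original tree.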