It is generally assumed that databases have to reside in external, inexpensive storage because of their sheer size. With current technology for external storage systems, a small number of sequential scans of the data is strictly preferable to random data accesses in terms of performance. Database technology, and query processing technology in particular, has developed around a notion of memory hierarchies with layers of greatly varying sizes and access times. These technologies appear to scale to their tasks and are very successful, but on closer investigation our theoretical understanding of the problems involved, and of optimal algorithms for these problems, is not as well developed. Recently, data stream processing has become an object of study in the database management community; from the viewpoint of database theory, however, it is a special case of the query processing problem on data in external storage in which we are limited to a single scan of the input data. In the present paper we study a clean machine model for external memory and stream processing. We establish tight bounds for the data complexity of Core XPath evaluation and filtering. We show that the number of scans of the external data induces a strict hierarchy (as long as internal memory space is sufficiently small, e.g., polylogarithmic in the size of the input). We also show that neither joins nor sorting are feasible if the product of the number r(n) of scans of the external memory and the size s(n) of the internal memory buffers is sufficiently small, i.e., of size o(n).
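The trade-off between the number of scans r(n) and the internal memory size s(n) can be made concrete with a small sketch. It is an illustration only, not the paper's machine model or proof: the function name `intersects`, the record layout `("R", value)` / `("S", value)`, and the assumption that all R-records precede all S-records are hypothetical choices made here. Memory is counted in buffered values rather than bits. The sketch decides whether two relations share a value (a simple semijoin test) by buffering one chunk of R per sequential scan, so it needs roughly |R| / `buffer_size` scans, which is why the product of scans and buffer size stays around n.

```python
from typing import List, Tuple

# Hypothetical record layout: ("R", value) or ("S", value).
Record = Tuple[str, int]


def intersects(external: List[Record], buffer_size: int) -> Tuple[bool, int]:
    """Return (R and S share a value, number of sequential scans used).

    Assumes every R-record appears before every S-record in `external`.
    Only `buffer_size` values plus two counters are kept in memory.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    scans = 0
    offset = 0  # number of R-records already handled by earlier scans
    while True:
        scans += 1
        buffer = set()  # holds at most buffer_size values
        seen_r = 0      # R-records encountered during this scan
        for tag, value in external:  # one sequential scan, no random access
            if tag == "R":
                # Buffer only the chunk assigned to this scan.
                if offset <= seen_r < offset + buffer_size:
                    buffer.add(value)
                seen_r += 1
            elif value in buffer:
                return True, scans
        offset += buffer_size
        if offset >= seen_r:  # every chunk of R has been checked
            return False, scans
```

For example, with 8 R-values, 8 disjoint S-values, and a buffer of 2 values, the sketch uses 4 scans; with a buffer of 8 values it uses a single scan. The lower bound stated in the abstract says that, in the paper's model, no algorithm can do asymptotically better than this kind of trade-off for joins or sorting.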