The lexical-analysis (or scanning) phase of a compiler attempts to partition an input string into a sequence of tokens. The convention in most languages is that the input is scanned left to right, and each token identified is a “maximal munch” of the remaining input, that is, the longest prefix of the remaining input that is a token of the language. Although most of the standard compiler textbooks present a way to perform maximal-munch tokenization, the algorithm they describe can, for certain sets of token definitions, cause the scanner to exhibit quadratic behavior in the worst case. In this article, we show that maximal-munch tokenization can always be performed in time linear in the size of the input.
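
The quadratic behavior can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the article: the token definitions (`a` and `a*b`), the hand-built DFA in `DELTA`, the `tokenize` function, and its `steps` counter. With `memoize=False` the function follows the usual backtracking scheme: run the DFA as far as possible, then back up to the last accepting position. With `memoize=True` it also records (state, position) pairs that are known to lead to no accepting state and stops as soon as it would re-enter one. This is one plausible way to avoid rescanning, and the abstract does not say whether it is the article's method.

```python
# Hypothetical DFA for two token definitions: `a` and `a*b`.
# State 0: start; state 1: read exactly "a" (accepting);
# state 2: read "b" or a run of a's followed by "b" (accepting);
# state 3: read two or more a's (not accepting, still waiting for "b").
DELTA = {
    (0, "a"): 1, (0, "b"): 2,
    (1, "a"): 3, (1, "b"): 2,
    (3, "a"): 3, (3, "b"): 2,
}
ACCEPTING = {1, 2}


def tokenize(text, memoize=False):
    """Maximal-munch tokenization; returns (tokens, steps).

    `steps` counts DFA transitions taken, as a rough cost measure.
    """
    tokens, steps = [], 0
    failed = set()  # (state, position) pairs that cannot reach an accepting state
    start = 0
    while start < len(text):
        state, pos = 0, start
        last_accept = None
        trail = []  # non-accepting (state, position) pairs since the last accept
        while pos < len(text):
            nxt = DELTA.get((state, text[pos]))
            if nxt is None:
                break  # no transition: the DFA is stuck
            if memoize and (nxt, pos + 1) in failed:
                break  # an earlier scan already showed this leads nowhere
            steps += 1
            state, pos = nxt, pos + 1
            if state in ACCEPTING:
                last_accept = pos
                trail.clear()
            else:
                trail.append((state, pos))
        if last_accept is None:
            raise ValueError(f"no token matches at position {start}")
        if memoize:
            failed.update(trail)  # everything after the last accept was a dead end
        tokens.append(text[start:last_accept])
        start = last_accept  # back up to the end of the longest match
    return tokens, steps


if __name__ == "__main__":
    text = "a" * 1000
    print(tokenize(text)[1])                # backtracking scanner
    print(tokenize(text, memoize=True)[1])  # with dead-end memoization
```

On an input of n copies of `a` with no `b`, the backtracking scanner reads to the end of the input before emitting each one-character token, so it takes n(n+1)/2 transitions (500500 for n = 1000). In this sketch the memoized variant takes 2n - 1 (1999 for n = 1000), at the cost of storing the failed pairs.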