Recognizing substrings of LR(k) languages in linear time

  • Authors:
  • Joseph Bates;Alon Lavie

  • Affiliations:
  • Carnegie Mellon Univ., Pittsburgh, PA;Carnegie Mellon Univ., Pittsburgh, PA

  • Venue:
  • ACM Transactions on Programming Languages and Systems (TOPLAS)
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

LR parsing techniques have long been studied as being efficient and powerful methods for processing context-free languages. A linear-time algorithm for recognizing languages representable by LR(k) grammars has long been known. Recognizing substrings of a context-free language is at least as hard as recognizing full strings of the language, since the latter problem easily reduces to the former. In this article we present a linear-time algorithm for recognizing substrings of LR(k) languages, thus showing that the substring recognition problem for these languages is no harder than the full string recognition problem. An interesting data structure, the Forest-Structured Stack, allows the algorithm to track all possible parses of a substring without loosing the efficiency of the original LR parser. We present the algorithm, prove its correctness, analyze its complexity, and mention several applications that have been constructed.