Robust multilingual parsing using island grammars

  • Authors:
  • Nikita Synytskyy;James R. Cordy;Thomas R. Dean

  • Affiliations:
  • School of Computing, Queen's University, Kingston, Ontario, Canada K7L 3N6 (all authors)

  • Venue:
  • CASCON '03 Proceedings of the 2003 conference of the Centre for Advanced Studies on Collaborative research
  • Year:
  • 2003

Abstract

Any attempt at automated software analysis or modification must be preceded by a comprehension step, i.e. parsing. This task, while often considered straightforward, can in fact be very challenging for some source code. Files that make up web applications serve as an example of such difficult-to-parse artifacts, for two reasons. First, these files often contain several programming languages at once, sometimes with widely varying syntaxes, and intermingled at the statement level. Second, the code routinely contains syntax errors. Understanding such files calls for a robust parser that can handle multiple languages simultaneously.

An approach to creating such a parser, based on the concept of island grammars, is presented here. Island grammars have been used in the past for robust parsing and lightweight analysis of software. Some of the features of these grammars make them uniquely fit for parsing multiple languages simultaneously.
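
To illustrate the general idea behind island grammars, the following is a minimal Python sketch and not the parser described in the paper. It treats constructs of interest ("islands") as recognizable patterns and everything else as uninterpreted "water", so malformed surrounding markup does not stop the parse. The island kinds (`asp`, `script`), the regular expressions, and the function name `parse_islands` are assumptions chosen for this example. The sketch also does not handle islands nested inside other islands, which a real multilingual parser would need to support.

```python
import re
from typing import List, Tuple

# Hypothetical island definitions for this sketch: each island kind is
# paired with a pattern whose first group captures the embedded code.
ISLANDS = [
    ("asp", re.compile(r"<%(.*?)%>", re.DOTALL)),
    ("script", re.compile(r"<script\b[^>]*>(.*?)</script\s*>",
                          re.DOTALL | re.IGNORECASE)),
]


def parse_islands(source: str) -> List[Tuple[str, str]]:
    """Split source into ("water", text) and (island_kind, code) segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(source):
        # Find the island that starts earliest at or after the current position.
        best = None
        for kind, pattern in ISLANDS:
            match = pattern.search(source, pos)
            if match and (best is None or match.start() < best[1].start()):
                best = (kind, match)
        if best is None:
            # No more islands: the rest of the input is water, even if malformed.
            segments.append(("water", source[pos:]))
            break
        kind, match = best
        if match.start() > pos:
            segments.append(("water", source[pos:match.start()]))
        segments.append((kind, match.group(1)))
        pos = match.end()
    return segments
```

For example, calling `parse_islands('<p>Hi <% x = 1 %></p><script>var y;</script><b')` yields the two embedded code fragments as islands, while the surrounding markup, including the truncated `<b` tag at the end, is kept as water rather than reported as an error.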