A user-extensible and adaptable parser architecture

Authors:
John Tobin;Carl Vogel
Affiliations:
School of Computer Science and Statistics, Trinity College, Dublin 2, Ireland;School of Computer Science and Statistics, Trinity College, Dublin 2, Ireland
Venue:
Knowledge-Based Systems
Year:
2009

Abstract

Some parsers must be precise and strict, yet must also let users adapt or extend them to handle new inputs without requiring in-depth knowledge of the parser's internal workings. This paper presents a novel parsing architecture, designed for parsing Postfix log files, that aims to make parsing new inputs as simple as possible: users can trivially add new rules (to parse variants of existing inputs) and, with relatively little effort, add new actions (to process a previously unknown category of input). The architecture scales linearly or better as the number of rules and the size of the input increase, making it suitable for parsing large corpora or months of accumulated data.
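
The separation between rules and actions described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LogParser`, the methods `add_action`, `add_rule`, and `parse_line`, the action name `record_connect`, and the sample log lines are all hypothetical, chosen only to show how a new rule (a pattern covering a variant of a known input) can reuse an existing action, while a new category of input needs a new action. The sketch scans rules sequentially and makes no attempt to reproduce the scaling behaviour reported in the abstract.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A pattern plus the name of the action that handles its matches."""
    pattern: re.Pattern
    action: str


class LogParser:
    """Hypothetical rule/action parser; all names here are assumptions."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []
        self.actions: Dict[str, Callable[[Dict[str, str]], None]] = {}

    def add_action(self, name: str, func: Callable[[Dict[str, str]], None]) -> None:
        # An action processes one category of input.
        self.actions[name] = func

    def add_rule(self, regex: str, action: str) -> None:
        # A rule recognises one variant of an input and names its action.
        if action not in self.actions:
            raise KeyError(f"unknown action: {action}")
        self.rules.append(Rule(re.compile(regex), action))

    def parse_line(self, line: str) -> bool:
        # Strict parsing: a line is handled only if some rule matches it.
        for rule in self.rules:
            match = rule.pattern.search(line)
            if match:
                self.actions[rule.action](match.groupdict())
                return True
        return False


connections: List[str] = []
parser = LogParser()
parser.add_action("record_connect", lambda fields: connections.append(fields["ip"]))

# First rule: one form of a connection message.
parser.add_rule(r"^connect from (?P<host>\S+)\[(?P<ip>[^\]]+)\]$", "record_connect")
# Second rule: an invented variant of the same input, reusing the same action.
parser.add_rule(
    r"^connection established from (?P<host>\S+)\[(?P<ip>[^\]]+)\]$",
    "record_connect",
)

sample_lines = [
    "connect from mail.example.com[192.0.2.10]",
    "connection established from relay.example.net[198.51.100.7]",
    "something unexpected",
]
# Lines that no rule matches are collected rather than silently dropped.
unparsed = [line for line in sample_lines if not parser.parse_line(line)]
```

In this sketch, supporting a new message variant takes one `add_rule` call, whereas supporting a new category of input means writing a function and registering it with `add_action`. Collecting unmatched lines in `unparsed` shows the user which inputs still lack a rule.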