Tools for large-scale parser development

  • Authors: Natural Language Processing Group
  • Affiliations: Microsoft Research, One Microsoft Way, Redmond WA
  • Venue: Proceedings of the COLING-2000 Workshop on Efficiency In Large-Scale Parsing Systems
  • Year: 2000

Abstract

We demonstrate the tool set available to linguistic developers in our NLP lab, with particular emphasis on the tools for incremental regression testing and for creating regression suites. These tools are currently in use in the daily development of broad-coverage language analysis systems for seven languages (Chinese, English, French, German, Japanese, Korean, and Spanish). The system is modular, with the parsing engine and debugging environments shared by all languages. Linguistic rules are written in a proprietary language, called G, whose features are uniquely suited to linguistic tasks (Heidorn, in press). Both the engine underlying the system and the user interface for linguistic developers are Unicode-enabled, and thus support European as well as non-Indo-European languages.