Tools for large-scale parser development

Authors:
Natural Language Processing Group
Affiliations:
Microsoft Research, One Microsoft Way, Redmond WA
Venue:
Proceedings of the COLING-2000 Workshop on Efficiency In Large-Scale Parsing Systems
Year:
2000

Abstract

We demonstrate the tool set available to linguistic developers in our NLP lab, with particular emphasis on the tools for incremental regression testing and for creating regression suites. These tools are currently in use in the daily development of broad-coverage language analysis systems for seven languages (Chinese, English, French, German, Japanese, Korean, and Spanish). The system is modular, with the parsing engine and debugging environments shared by all languages. Linguistic rules are written in a proprietary language (called G) whose features are uniquely suited to linguistic tasks (Heidorn, in press). The engine underlying the system, as well as the user interface for linguistic developers, is Unicode-enabled, thus supporting both European and non-Indo-European languages.
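
The idea of regression testing a parser against a stored suite can be illustrated with a minimal Python sketch. It is not the tool set described in the abstract, whose design is not given here. The names `toy_parse`, `build_suite`, and `run_regression` are hypothetical, and the "parser" is a trivial stand-in that only brackets its input.

```python
from typing import Callable, Dict, List, Tuple

Parser = Callable[[str], str]


def toy_parse(sentence: str) -> str:
    """Hypothetical stand-in for a real parser: wraps the tokens in (S ...)."""
    return "(S " + " ".join(sentence.split()) + ")"


def build_suite(sentences: List[str], parse: Parser) -> Dict[str, str]:
    """Create a regression suite by recording the current analysis of each sentence."""
    return {sentence: parse(sentence) for sentence in sentences}


def run_regression(suite: Dict[str, str], parse: Parser) -> List[Tuple[str, str, str]]:
    """Re-parse every sentence in the suite and return (sentence, expected, actual)
    for each sentence whose analysis no longer matches the recorded one."""
    differences = []
    for sentence, expected in suite.items():
        actual = parse(sentence)
        if actual != expected:
            differences.append((sentence, expected, actual))
    return differences


if __name__ == "__main__":
    suite = build_suite(["the dog barks", "time flies"], toy_parse)

    # Simulate a grammar change that alters the analysis of one sentence only.
    def changed_parse(sentence: str) -> str:
        if sentence == "time flies":
            return "(S (NP time) (VP flies))"
        return toy_parse(sentence)

    for sentence, expected, actual in run_regression(suite, changed_parse):
        print(f"CHANGED: {sentence!r}\n  was: {expected}\n  now: {actual}")
```

Under these assumptions, a developer would rerun the suite after each grammar change and inspect only the sentences whose analyses differ, which is what makes the testing incremental.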