Parsing engineering and empirical robustness

  • Authors:
  • Roberto Basili; Fabio Massimo Zanzotto

  • Affiliations:
  • Department of Computer Science, Systems and Production, University of Rome ‘Tor Vergata’, 00133 Rome, Italy; e-mail: basili@info.uniroma2.it, zanzotto@info.uniroma2.it

  • Venue:
  • Natural Language Engineering
  • Year:
  • 2002

Abstract

Robustness has traditionally been stressed as a generally desirable property of any computational model or system. The human natural language interpretation device exhibits this property as the ability to deal with odd sentences. However, the difficulty of explaining robustness theoretically within linguistic modelling has suggested the adoption of an empirical notion. In this paper, we propose an empirical definition of robustness based on the notion of performance. Furthermore, we present a framework for controlling parser robustness in the design phase. Control is achieved through two principles: modularisation, typical of software engineering practice, and the availability of domain-adaptable components. The methodology has been adopted in the production of CHAOS, a pool of syntactic modules that has been used in real applications. This pool of modules enables a large-scale validation of the notion of empirical robustness, on the one hand, and of the design methodology, on the other, over different corpora and two languages (English and Italian).