Linguistically motivated large-scale NLP with C&C and Boxer

  • Authors:
  • James R. Curran; Stephen Clark; Johan Bos

  • Affiliations:
  • University of Sydney, NSW, Australia; Oxford University, Oxford, UK; Università di Roma "La Sapienza", Roma, Italy

  • Venue:
  • ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
  • Year:
  • 2007

Abstract

The statistical modelling of language, together with advances in wide-coverage grammar development, has led to high levels of robustness and efficiency in NLP systems and has made linguistically motivated large-scale language processing a possibility (Matsuzaki et al., 2007; Kaplan et al., 2004). This paper describes an NLP system which is based on syntactic and semantic formalisms from theoretical linguistics, and which we have used to analyse the entire Gigaword corpus (1 billion words) in less than 5 days using only 18 processors. This combination of detail and speed of analysis represents a breakthrough in NLP technology.
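
To put the reported figures in perspective, the following is a rough back-of-the-envelope sketch of the throughput they imply. It assumes continuous 24-hour operation and an even split of work across the 18 processors, neither of which the abstract states; the function name `min_throughput` is purely illustrative. Because the run took "less than 5 days", the results are lower bounds, not measured rates.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def min_throughput(words: int, days: float, processors: int) -> tuple[float, float]:
    """Return (aggregate, per-processor) throughput in words per second.

    Assumes the processors ran continuously for the full period and shared
    the work evenly (an assumption, not a claim from the paper).
    """
    seconds = days * SECONDS_PER_DAY
    aggregate = words / seconds
    return aggregate, aggregate / processors


# Figures from the abstract: 1 billion words, under 5 days, 18 processors.
aggregate, per_processor = min_throughput(1_000_000_000, 5, 18)
print(f"aggregate:     at least {aggregate:.1f} words/s")      # 2314.8
print(f"per processor: at least {per_processor:.1f} words/s")  # 128.6
```

Under these assumptions the system as a whole processed at least about 2,300 words per second, or roughly 130 words per second per processor.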