Wide-coverage semantic analysis with Boxer

  • Authors: Johan Bos

  • Affiliations: University of Rome "La Sapienza", Italy

  • Venue: STEP '08 Proceedings of the 2008 Conference on Semantics in Text Processing
  • Year: 2008

Abstract

Boxer is an open-domain software component for semantic analysis of text, based on Combinatory Categorial Grammar (CCG) and Discourse Representation Theory (DRT). Used together with the C&C tools, Boxer reaches more than 95% coverage on newswire texts. The semantic representations produced by Boxer, known as Discourse Representation Structures (DRSs), incorporate a neo-Davidsonian representation for events, using the VerbNet inventory of thematic roles. The resulting DRSs can be translated to ordinary first-order logic formulas and processed by standard theorem provers for first-order logic. Boxer's performance on the shared task for comparing semantic representations was promising. It was able to produce complete DRSs for all seven texts. Manual inspection of the output revealed that: (a) the computed predicate argument structure was generally of high quality, in particular when dealing with hard constructions involving control or coordination; (b) discourse structure triggered by conditionals, negation or discourse adverbs was overall correctly computed; (c) some measure and time expressions are correctly analysed, others aren't; (d) several shallow analyses are given for lexical phrases that require deep analysis; (e) bridging references and pronouns are not resolved in most cases. Boxer is distributed with the C&C tools and is freely available for research purposes.
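To make the translation from DRSs to first-order logic concrete, here is a minimal sketch in Python of the standard textbook DRT translation. It is an illustration only, not Boxer's implementation or output format: the class names (`Pred`, `Not`, `Imp`, `DRS`), the function `drs_to_fol`, and the formula syntax are all assumptions made for this example. Discourse referents become existential quantifiers, conditions are conjoined, and the referents of an implication's antecedent are quantified universally. The example encodes "A man walks" in neo-Davidsonian style, with an event variable `e` linked to its participant through a thematic role (`Agent`).

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Pred:
    """Basic condition, e.g. man(x) or Agent(e,x)."""
    name: str
    args: Tuple[str, ...]


@dataclass
class Not:
    """Negated sub-DRS."""
    drs: "DRS"


@dataclass
class Imp:
    """Implicational condition: antecedent DRS => consequent DRS."""
    antecedent: "DRS"
    consequent: "DRS"


Condition = Union[Pred, Not, Imp]


@dataclass
class DRS:
    """A box: discourse referents plus conditions (hypothetical encoding)."""
    referents: List[str]
    conditions: List[Condition]


def conj(parts: List[str]) -> str:
    """Conjoin formulas; always parenthesise so scope is unambiguous."""
    return "(" + " & ".join(parts) + ")" if parts else "true"


def cond_to_fol(c: Condition) -> str:
    if isinstance(c, Pred):
        return f"{c.name}({','.join(c.args)})"
    if isinstance(c, Not):
        return f"-{drs_to_fol(c.drs)}"
    if isinstance(c, Imp):
        # Antecedent referents are universally quantified over the implication.
        body = conj([cond_to_fol(x) for x in c.antecedent.conditions])
        formula = f"({body} -> {drs_to_fol(c.consequent)})"
        for v in reversed(c.antecedent.referents):
            formula = f"all {v}.{formula}"
        return formula
    raise TypeError(f"unknown condition: {c!r}")


def drs_to_fol(drs: DRS) -> str:
    # Referents of a plain DRS are existentially quantified.
    formula = conj([cond_to_fol(c) for c in drs.conditions])
    for v in reversed(drs.referents):
        formula = f"exists {v}.{formula}"
    return formula


# "A man walks", with an event variable e and a thematic role.
a_man_walks = DRS(
    ["x", "e"],
    [Pred("man", ("x",)), Pred("walk", ("e",)), Pred("Agent", ("e", "x"))],
)

# "Every man walks", as an implication between two boxes.
every_man_walks = DRS(
    [],
    [Imp(DRS(["x"], [Pred("man", ("x",))]),
         DRS(["e"], [Pred("walk", ("e",)), Pred("Agent", ("e", "x"))]))],
)

print(drs_to_fol(a_man_walks))
print(drs_to_fol(every_man_walks))
```

The resulting strings are ordinary first-order formulas, so in principle they could be handed to a theorem prover after adapting the syntax to that prover's input language.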