Interpretation of compound nominalisations using corpus and web statistics

  • Authors: Jeremy Nicholson; Timothy Baldwin
  • Affiliations: University of Melbourne, Australia; University of Melbourne, Australia
  • Venue: MWE '06 Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties
  • Year: 2006

Abstract

We present two novel paraphrase tests for automatically predicting the inherent semantic relation of a given compound nominalisation as one of subject, direct object, or prepositional object. We compare these to the usual verb-argument paraphrase test, using both corpus statistics and frequencies obtained by scraping the Google search engine interface. We also implemented a statistical measure more robust than maximum likelihood estimation: the confidence interval. A significant reduction in data sparseness was achieved, but this alone is insufficient to provide a substantial performance improvement.
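To illustrate why a confidence interval can be more robust than maximum likelihood estimation on sparse counts, the following is a minimal sketch rather than the authors' implementation. It assumes a Wilson score interval (the abstract does not say which interval was used), and the function names and paraphrase counts are hypothetical, chosen only for illustration.

```python
import math

def mle(hits, total):
    """Maximum likelihood estimate: the raw relative frequency."""
    return hits / total if total else 0.0

def wilson_lower(hits, total, z=1.96):
    """Lower bound of the Wilson score interval (an assumed choice of
    interval; z=1.96 corresponds to roughly 95% confidence)."""
    if total == 0:
        return 0.0
    p = hits / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom

def predict(counts, score):
    """Pick the relation whose (hits, total) pair scores highest."""
    return max(counts, key=lambda rel: score(*counts[rel]))

# Hypothetical counts: (paraphrase matches, paraphrase attempts) per relation.
counts = {
    "subject": (2, 2),               # perfect ratio, but almost no evidence
    "direct_object": (40, 50),       # lower ratio, far more evidence
    "prepositional_object": (5, 50),
}

print(predict(counts, mle))
print(predict(counts, wilson_lower))
```

With these made-up counts, the raw ratio favours the relation supported by only two observations, whereas the interval's lower bound penalises that sparse evidence and favours the better-attested relation.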