Search engine statistics beyond the n-gram: application to noun compound bracketing

  • Authors:
  • Preslav Nakov;Marti Hearst

  • Affiliations:
  • University of California, Berkeley, Berkeley, CA;University of California, Berkeley, Berkeley, CA

  • Venue:
  • CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In order to achieve the long-range goal of semantic interpretation of noun compounds, it is often necessary to first determine their syntactic structure. This paper describes an unsupervised method for noun compound bracketing which extracts statistics from Web search engines using a X2 measure, a new set of surface features, and paraphrases. On a gold standard, the system achieves results of 89.34% (baseline 66.80%), which is a sizable improvement over the state of the art (80.70%).