Disambiguating noun compounds with latent semantic indexing

  • Authors: Alan M. Buckeridge; Richard F. E. Sutcliffe
  • Affiliations: University of Limerick, Limerick, Ireland; University of Limerick, Limerick, Ireland
  • Venue: COMPUTERM '02 COLING-02 on COMPUTERM 2002: second international workshop on computational terminology - Volume 14
  • Year: 2002

Abstract

Technical terms in text often appear as noun compounds, a frequently occurring yet highly ambiguous construction whose interpretation relies on extra-syntactic information. Several statistical methods for disambiguating compounds have been reported in the literature, often with quite impressive results. However, a striking feature of all these approaches is that they rely on the existence of previously seen unambiguous compounds, meaning they are prone to the problem of sparse data. This difficulty has been overcome somewhat through the use of hand-crafted knowledge resources to collect statistics on "concepts" rather than noun tokens, but domain-independence has been sacrificed by doing so. We report here on work investigating the application of Latent Semantic Indexing to provide a robust domain-independent source of the extra-syntactic knowledge necessary for noun compound disambiguation.
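
As a rough illustration of the idea, and not the authors' implementation, the sketch below shows how Latent Semantic Indexing could supply the similarity scores needed to bracket a three-noun compound. Everything in it is an assumption made for the example: the toy corpus DOCS, the helper names build_lsi, cosine and bracket, the choice of k = 2 dimensions, and the simple adjacency-style rule that compares the similarity of the first two nouns with that of the last two. The abstract does not specify any of these details.

```python
import numpy as np

# Hypothetical toy corpus; each string is treated as one document.
DOCS = [
    "computer science course",
    "computer science lecture",
    "computer science exam",
    "university department staff",
    "university department office",
]

def build_lsi(docs, k=2):
    """Build a term-document count matrix and reduce it to k dimensions with SVD."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    # Each row is one term's vector in the reduced (latent) space.
    return index, u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def bracket(compound, index, vectors):
    """Assumed adjacency-style rule: bracket left if sim(n1, n2) >= sim(n2, n3)."""
    n1, n2, n3 = (vectors[index[w]] for w in compound)
    return "left" if cosine(n1, n2) >= cosine(n2, n3) else "right"

if __name__ == "__main__":
    index, vectors = build_lsi(DOCS, k=2)
    print(bracket(("computer", "science", "department"), index, vectors))
    print(bracket(("university", "computer", "science"), index, vectors))
```

In this toy corpus "course" and "lecture" never occur in the same document, yet their reduced vectors end up close together because both occur alongside "computer science". This is the property that makes such a representation attractive when unambiguous compounds have not been seen before, although a realistic setup would need a much larger corpus and careful selection of k.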