A "not-so-shallow" parser for collocational analysis

  • Authors:
  • R. Basili; M. T. Pazienza; P. Velardi

  • Affiliations:
  • Università di Roma, Tor Vergata; Università di Roma, Tor Vergata; Università di Ancona

  • Venue:
  • COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
  • Year:
  • 1994

Abstract

Collocational analysis is the basis of many studies on lexical acquisition. Collocations are extracted from corpora using more or less shallow processing techniques, ranging from purely statistical methods to partial parsers. Our point is that, although one of the objectives of collocational analysis is to acquire high-coverage lexical data at low human cost, this objective is often not met. Human work is in fact required for the initial training of most statistically based methods. A more serious problem is that shallow processing techniques produce a level of noise that is not acceptable for a fully automated system. We propose in this paper a not-so-shallow parsing strategy that reliably detects binary and ternary relations among words. We show that adding more syntactic knowledge to the recipe significantly improves the recall and precision of the detected collocations, regardless of any subsequent statistical computation, while still meeting the computational requirements of corpus parsers.