Syntactic analysis of natural language using linguistic rules and corpus-based patterns

  • Authors:
  • Pasi Tapanainen;Timo Järvinen

  • Affiliations:
  • Rank Xerox Research Centre, Grenoble Laboratory;University of Helsinki, Research Unit for Computational Linguistics

  • Venue:
  • COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

We are concerned with the syntactic annotation of unrestricted text. We combine a rule-based analysis with subsequent exploitation of empirical data. The rule-based surface syntactic analyser leaves some amount of ambiguity in the output that is resolved using empirical patterns. We have implemented a system for generating and applying corpus-based patterns. Some patterns describe the main constituents in the sentence and some the local context of the each syntactic function. There are several (partly) reduntant patterns, and the "pattern" parser selects analysis of the sentence that matches the strictest possible pattern(s). The system is applied to an experimental corpus. We present the results and discuss possible refinements of the method from a linguistic point of view.