DILUCT: an open-source Spanish dependency parser based on rules, heuristics, and selectional preferences

  • Authors:
  • Hiram Calvo; Alexander Gelbukh

  • Affiliations:
  • Natural Language Processing Laboratory, Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico (both authors)

  • Venue:
  • NLDB'06 Proceedings of the 11th international conference on Applications of Natural Language to Information Systems
  • Year:
  • 2006

Abstract

A method for recognizing syntactic patterns in Spanish is presented. The method is based on dependency parsing: heuristic rules infer dependency relationships between words, and word co-occurrence statistics (learnt in an unsupervised manner) resolve ambiguities such as prepositional phrase attachment. If a complete parse cannot be produced, a partial structure is built with some (if not all) dependency relations identified. Evaluation shows that, in spite of its simplicity, the parser's accuracy is superior to that of the existing parsers available for Spanish. Although certain grammar rules and the lexical resources used are specific to Spanish, the suggested approach is language-independent.
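
To make the idea of resolving prepositional phrase attachment with co-occurrence statistics more concrete, here is a minimal Python sketch. It is not the DILUCT implementation: the function names (attachment_score, attach_pp), the add-one smoothing, and every count below are assumptions invented for illustration. It only shows how a choice between verb and noun attachment might be driven by how often each candidate head has been seen with the preposition and its noun.

```python
from collections import Counter

# Toy co-occurrence counts of (head, preposition, noun) triples.
# All numbers are invented for illustration; a real system would
# collect them from a large unannotated corpus.
TRIPLE_COUNTS = Counter({
    ("comer", "con", "cuchara"): 8,   # "eat with a spoon"
    ("sopa", "con", "cuchara"): 1,
    ("comer", "con", "fideos"): 1,
    ("sopa", "con", "fideos"): 6,     # "soup with noodles"
})

# How often each candidate head occurs overall (also invented).
HEAD_COUNTS = Counter({"comer": 50, "sopa": 20})


def attachment_score(head, prep, noun, smoothing=1.0):
    """Smoothed relative frequency of the triple given its head.

    The smoothing scheme is an assumption of this sketch.
    """
    return (TRIPLE_COUNTS[(head, prep, noun)] + smoothing) / (
        HEAD_COUNTS[head] + smoothing
    )


def attach_pp(verb, obj_noun, prep, pp_noun):
    """Pick the head (verb or object noun) with the higher score.

    On a tie, max() returns the first candidate, i.e. the verb.
    """
    candidates = [verb, obj_noun]
    return max(candidates, key=lambda h: attachment_score(h, prep, pp_noun))


if __name__ == "__main__":
    # "comer sopa con cuchara" vs. "comer sopa con fideos"
    print(attach_pp("comer", "sopa", "con", "cuchara"))
    print(attach_pp("comer", "sopa", "con", "fideos"))
```

With these toy counts, the phrase "con cuchara" attaches to the verb "comer", while "con fideos" attaches to the noun "sopa". Comparing relative frequencies rather than raw counts keeps a very frequent head from winning every decision.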