Dependency syntax analysis using grammar induction and a lexical categories precedence system

  • Authors:
  • Hiram Calvo;Omar J. Gambino;Alexander Gelbukh;Kentaro Inui

  • Affiliations:
  • Center for Computing Research, IPN, D.F., México and Computational Linguistics, Nara Institute of Science and Technology, Ikoma, Nara, Japan;Center for Computing Research, IPN, D.F., México;Center for Computing Research, IPN, D.F., México and Faculty of Law, Waseda University, Shinjuku-ku, Tokyo, Japan;Computational Linguistics, Nara Institute of Science and Technology, Ikoma, Nara, Japan

  • Venue:
  • CICLing'11 Proceedings of the 12th international conference on Computational linguistics and intelligent text processing - Volume Part I
  • Year:
  • 2011

Abstract

The unsupervised approach to syntactic analysis tries to discover the structure of text using only raw text. In this paper we explore this approach using grammar inference algorithms. Although it still has room for improvement, our approach tries to minimize the effect of the current limitations of some grammar inductors in two ways: by adding morphological information before the grammar induction process, and by introducing a novel system for converting a shallow parse to dependencies, which reconstructs information about the heads the inductor leaves undiscovered by means of a lexical categories precedence system. The performance of our parser, which needs no syntactically tagged resources or rules and is trained on a small corpus, is 10% below that of commercial semisupervised dependency analyzers for Spanish, and comparable to the state of the art for English.
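
To make the idea of a lexical categories precedence system concrete, the following is a minimal sketch of how a head might be chosen inside a flat chunk when the inductor has not identified one. It is an illustration under stated assumptions, not the authors' implementation: the precedence order, the tag names, and the functions `pick_head` and `chunk_to_dependencies` are all hypothetical, and the abstract does not specify any of them.

```python
# Hypothetical precedence order (an assumption, not taken from the paper):
# categories earlier in the list win the head position within a chunk.
PRECEDENCE = ["VERB", "NOUN", "ADJ", "ADV", "PREP", "DET"]


def pick_head(chunk, precedence=PRECEDENCE):
    """Return the index of the token whose category ranks highest.

    `chunk` is a non-empty list of (word, category) pairs.
    """
    rank = {tag: i for i, tag in enumerate(precedence)}
    # Unknown categories rank last; ties go to the leftmost token,
    # because min() returns the first minimal element.
    return min(
        range(len(chunk)),
        key=lambda i: rank.get(chunk[i][1], len(precedence)),
    )


def chunk_to_dependencies(chunk, precedence=PRECEDENCE):
    """Attach every non-head token in a flat chunk to the chosen head.

    Returns a list of (dependent_word, head_word) pairs.
    """
    head = pick_head(chunk, precedence)
    return [
        (word, chunk[head][0])
        for i, (word, _) in enumerate(chunk)
        if i != head
    ]


chunk = [("the", "DET"), ("old", "ADJ"), ("house", "NOUN")]
print(chunk_to_dependencies(chunk))
```

In this toy chunk the noun outranks the adjective and the determiner, so both attach to "house". A full system would also need to link the heads of neighbouring chunks to each other, which this sketch leaves out.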