A Graph Based Method for Building Multilingual Weakly Supervised Dependency Parsers

  • Authors:
  • Jagadeesh Gorla;Anil Kumar Singh;Rajeev Sangal;Karthik Gali;Samar Husain;Sriram Venkatapathy

  • Affiliations:
  • Language Technologies Research Centre, IIIT, Hyderabad, India (all authors)

  • Venue:
  • GoTAL '08 Proceedings of the 6th international conference on Advances in Natural Language Processing
  • Year:
  • 2008

Abstract

The structure of a sentence can be viewed as a spanning tree in a linguistically augmented graph of syntactic nodes. This paper presents an approach to unlabeled dependency parsing based on this view. The first step marks the chunks and chunk heads of a given sentence and then identifies the intra-chunk dependency relations. The second step learns to identify the inter-chunk dependency relations. For this second step, we use an initialization technique based on a measure we call Normalized Conditional Mutual Information (NCMI), together with a few linguistic constraints. We present results for Hindi: a precision of 80.83% on sentences of fewer than 10 words and 66.71% overall. This is significantly better than the baseline, which uses random initialization.
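
The spanning-tree view in the abstract can be illustrated with a small sketch. The abstract does not define NCMI or the learning procedure, so the code below does not implement either. It assumes that some symmetric score between chunk heads is already available and uses a standard Prim-style greedy search to pick a maximum spanning tree, directing each edge away from an artificial root. The function name, the `ROOT` node, the romanized Hindi chunk heads, and all score values are hypothetical and serve only as illustration.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def max_spanning_tree_heads(
    nodes: List[str], score: Dict[Edge, float], root: str
) -> Dict[str, str]:
    """Grow a maximum spanning tree from `root` (Prim-style greedy).

    `score` holds symmetric, hypothetical association scores between
    chunk heads; a missing pair is treated as minus infinity.
    Returns a mapping from each non-root node to its head.
    """
    in_tree = {root}
    heads: Dict[str, str] = {}
    while len(in_tree) < len(nodes):
        best = None  # (score, head, child) of the best edge leaving the tree
        for h in nodes:
            if h not in in_tree:
                continue
            for c in nodes:
                if c in in_tree:
                    continue
                s = score.get((h, c), score.get((c, h), float("-inf")))
                if best is None or s > best[0]:
                    best = (s, h, c)
        _, h, c = best
        heads[c] = h  # attach the child to the node already in the tree
        in_tree.add(c)
    return heads


# Illustrative chunk heads for a three-chunk sentence (assumed values,
# not taken from the paper).
example_nodes = ["ROOT", "raam", "seb", "khaayaa"]
example_scores = {
    ("ROOT", "khaayaa"): 0.9,
    ("ROOT", "raam"): 0.2,
    ("ROOT", "seb"): 0.1,
    ("khaayaa", "raam"): 0.8,
    ("khaayaa", "seb"): 0.7,
    ("raam", "seb"): 0.3,
}
example_heads = max_spanning_tree_heads(example_nodes, example_scores, "ROOT")
# example_heads == {"khaayaa": "ROOT", "raam": "khaayaa", "seb": "khaayaa"}
```

With these assumed scores, the verb attaches to the root and both nominal chunk heads attach to the verb. A real system along the lines of the abstract would replace the hand-written scores with learned ones and would apply linguistic constraints when choosing edges.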