On statistical parsing of French with supervised and semi-supervised strategies

  • Authors:
  • Marie Candito; Benoît Crabbé; Djamé Seddah

  • Affiliations:
  • Université Paris, Ufrl et Inria (Alpage), Paris, France;Université Paris, Ufrl et Inria (Alpage), Paris, France;Université Paris, LaLIC et Inria (Alpage), Paris, France

  • Venue:
  • CLAGI '09 Proceedings of the EACL 2009 Workshop on Computational Linguistic Aspects of Grammatical Inference
  • Year:
  • 2009

Abstract

This paper reports results on grammatical induction for French. We investigate how best to train a parser on the French Treebank (Abeillé et al., 2003), viewing the task as a trade-off between generalizability and interpretability. We compare, for French, a supervised lexicalized parsing algorithm with a semi-supervised unlexicalized algorithm (Petrov et al., 2006), along the lines of (Crabbé and Candito, 2008). We report the best results known to us on French statistical parsing, which we obtained with the semi-supervised learning algorithm. The reported experiments can give insights into the task of grammatical learning for a morphologically rich language with a relatively limited amount of training data, annotated with a rather flat structure.