Two baselines for unsupervised dependency parsing

  • Authors:
  • Anders Søgaard

  • Affiliations:
  • University of Copenhagen, Copenhagen S

  • Venue:
  • WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
  • Year:
  • 2012

Abstract

Results in unsupervised dependency parsing are typically compared to branching baselines and the DMV-EM parser of Klein and Manning (2004). State-of-the-art results are now well beyond these baselines. This paper describes two simple, heuristic baselines that are much harder to beat: the algorithm recently presented in Søgaard (2012) and a heuristic application of the universal rules presented in Naseem et al. (2010). Our first baseline (RANK) outperforms existing baselines, including PR-DMV (Gillenwater et al., 2010), while relying only on raw text, but all systems submitted to the Pascal Grammar Induction Challenge score better. Our second baseline (RULES), however, outperforms several submitted systems.