Improvements in unsupervised co-occurrence based parsing

  • Authors: Christian Hänig
  • Affiliations: Daimler AG, Research and Technology, Ulm, Germany
  • Venue: CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
  • Year: 2010

Abstract

This paper presents an algorithm for unsupervised co-occurrence based parsing that improves and extends existing approaches. The proposed algorithm iteratively induces a context-free grammar of the language in question. The resulting structure of a sentence is given as a hierarchical arrangement of constituents. Although the algorithm uses no a priori knowledge about the language, it is able to detect heads, modifiers, and the different ways in which a phrase type can be composed. For evaluation, the algorithm is applied to manually annotated part-of-speech tags (POS tags) as well as to word classes induced by an unsupervised part-of-speech tagger.
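
The general idea of iteratively inducing phrase rules from co-occurring tags can be illustrated with a small sketch. This is not the paper's algorithm: it uses raw adjacency frequency as a stand-in for a proper co-occurrence significance measure, and it does not attempt head or modifier detection. The function names, the `P0`, `P1`, ... phrase labels, the `min_count` threshold, and the toy POS-tagged corpus are all assumptions made for illustration.

```python
from collections import Counter


def most_frequent_pair(sentences, min_count=2):
    """Return the most frequent adjacent tag pair, or None if none is frequent enough.

    Assumption: raw frequency replaces a real significance test.
    """
    counts = Counter(
        (left, right)
        for sent in sentences
        for left, right in zip(sent, sent[1:])
    )
    if not counts:
        return None
    # Highest count wins; ties are broken alphabetically for determinism.
    pair, count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return pair if count >= min_count else None


def merge_pair(sentence, pair, label):
    """Replace each non-overlapping occurrence of `pair` with `label`, left to right."""
    merged, i = [], 0
    while i < len(sentence):
        if i + 1 < len(sentence) and (sentence[i], sentence[i + 1]) == pair:
            merged.append(label)
            i += 2
        else:
            merged.append(sentence[i])
            i += 1
    return merged


def induce_rules(sentences, max_rules=10):
    """Greedily build binary rules (label -> left right) until no pair is frequent enough."""
    rules = []
    current = [list(sent) for sent in sentences]
    for k in range(max_rules):
        pair = most_frequent_pair(current)
        if pair is None:
            break
        label = f"P{k}"  # hypothetical phrase label
        rules.append((label, pair))
        current = [merge_pair(sent, pair, label) for sent in current]
    return rules, current


# Toy corpus of POS tag sequences (invented for this example).
corpus = [
    ["DT", "NN", "VBZ", "DT", "NN"],
    ["DT", "JJ", "NN", "VBZ"],
    ["DT", "NN", "VBD"],
]
rules, parsed = induce_rules(corpus)
```

Because each new label can itself take part in later merges, the recorded rules form a hierarchy over the original tags, which is the kind of constituent structure the abstract describes.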