The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7
Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II
Sparsity in dependency grammar induction
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Using universal linguistic knowledge to guide grammar induction
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Neutralizing linguistically problematic annotations in unsupervised dependency parsing evaluation
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Unsupervised dependency parsing without training
Natural Language Engineering
The PASCAL Challenge on Grammar Induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Results in unsupervised dependency parsing are typically compared to branching baselines and the DMV-EM parser of Klein and Manning (2004). State-of-the-art results are now well beyond these baselines. This paper describes two simple heuristic baselines that are much harder to beat: the algorithm recently presented in Søgaard (2012) and a heuristic application of the universal rules presented in Naseem et al. (2010). Our first baseline (RANK) outperforms existing baselines, including PR-DMV (Gillenwater et al., 2010), while relying only on raw text, but all systems submitted to the PASCAL Grammar Induction Challenge score better. Our second baseline (RULES), however, outperforms several submitted systems.