The effect of semi-supervised learning on parsing long distance dependencies in German and Swedish

Authors:
Anders Søgaard;Christian Rishøj
Affiliations:
Center for Language Technology, University of Copenhagen, Copenhagen S;Center for Language Technology, University of Copenhagen, Copenhagen S
Venue:
IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
Year:
2010

Citing 13
Cited 0

Original Contribution: Stacked generalization

Neural Networks
Random Forests

Machine Learning
Three new probabilistic models for dependency parsing: an exploration

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Tri-Training: Exploiting Unlabeled Data Using Three Classifiers

IEEE Transactions on Knowledge and Data Engineering
Large scale semi-supervised linear SVMs

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Non-projective dependency parsing using spanning tree algorithms

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Semisupervised Learning for Computational Linguistics

Semisupervised Learning for Computational Linguistics
Stacking dependency parsers

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Parser combination by reparsing

NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Improving parsing accuracy by combining diverse dependency parsers

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Unbounded dependency recovery for parser evaluation

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Ensemble models for dependency parsing: cheap and good?

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Chinese chunking with tri-training learning

ICCPOL'06 Proceedings of the 21st international conference on Computer Processing of Oriental Languages: beyond the orient: the research challenges ahead

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper shows how the best data-driven dependency parsers available today [1] can be improved by learning from unlabeled data. We focus on German and Swedish and show that labeled attachment scores improve by 1.5%-2.5%. Error analysis shows that improvements are primarily due to better recovery of long distance dependencies.