Annealing structural bias in multilingual weighted grammar induction

Authors:
Noah A. Smith;Jason Eisner
Affiliations:
Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD
Venue:
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Year:
2006

Citing 21
Cited 20

Deterministic annealing EM algorithm

Neural Networks
Statistical Language Learning

Statistical Language Learning
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
Three generative, lexicalised models for statistical parsing

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Unsupervised word sense disambiguation rivaling supervised methods

ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Noun classification from predicate-argument structures

ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
Head automata and bilingual tiling: translation with minimal representations

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Efficient parsing for bilexical context-free grammars and head automaton grammars

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Learning Hidden Variable Networks: The Information Bottleneck Approach

The Journal of Machine Learning Research
The Penn Chinese TreeBank: Phrase structure annotation of a large corpus

Natural Language Engineering
Bootstrapping statistical parsers from small datasets

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
A generative constituent-context model for improved grammar induction

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Automatically acquiring phrase structure using distributional analysis

HLT '91 Proceedings of the workshop on Speech and Natural Language
Corpus-based induction of syntactic structure: models of dependency and constituency

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Annealing techniques for unsupervised statistical language learning

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Online large-margin training of dependency parsers

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Contrastive estimation: training log-linear models on unlabeled data

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Bootstrapping without the boot

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
CoNLL-X shared task on multilingual dependency parsing

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Parsing with soft and hard constraints on dependency length

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology

Using Short Dependency Relations from Auto-Parsed Data for Chinese Dependency Parsing

ACM Transactions on Asian Language Information Processing (TALIP)
Superior and efficient fully unsupervised pattern-based concept acquisition using an unsupervised parser

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Automatic selection of high quality parses created by a fully unsupervised parser

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Unsupervised induction of labeled parse trees by clustering with syntactic features

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Better informed training of latent syntactic features

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Improving unsupervised dependency parsing with richer contexts and smoothing

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Variational inference for grammar induction with prior knowledge

ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Unsupervised argument identification for Semantic Role Labeling

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Cross language dependency parsing using a bilingual lexicon

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Semi-supervised learning of dependency parsers using generalized expectation criteria

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Dependency grammar induction via bitext projection constraints

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Parser adaptation and projection with quasi-synchronous grammar features

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
From baby steps to Leapfrog: how "Less is More" in unsupervised dependency parsing

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Sparsity in dependency grammar induction

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Improved fully unsupervised parsing with zoomed learning

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Learning tractable word alignment models with complex constraints

Computational Linguistics
Covariance in Unsupervised Learning of Probabilistic Grammars

The Journal of Machine Learning Research
Posterior Sparsity in Unsupervised Dependency Parsing

The Journal of Machine Learning Research
Neutralizing linguistically problematic annotations in unsupervised dependency parsing evaluation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
From ranked words to dependency trees: two-stage unsupervised non-projective dependency parsing

TextGraphs-6 Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

We first show how a structural locality bias can improve the accuracy of state-of-the-art dependency grammar induction models trained by EM from unannotated examples (Klein and Manning, 2004). Next, by annealing the free parameter that controls this bias, we achieve further improvements. We then describe an alternative kind of structural bias, toward "broken" hypotheses consisting of partial structures over segmented sentences, and show a similar pattern of improvement. We relate this approach to contrastive estimation (Smith and Eisner, 2005a), apply the latter to grammar induction in six languages, and show that our new approach improves accuracy by 1-17% (absolute) over CE (and 8-30% over EM), achieving to our knowledge the best results on this task to date. Our method, structural annealing, is a general technique with broad applicability to hidden-structure discovery problems.