Joint parsing and alignment with weakly synchronized grammars

Authors: David Burkett, John Blitzer, Dan Klein
Affiliation: University of California, Berkeley (all authors)
Venue: HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Year: 2010


Abstract

Syntactic machine translation systems extract rules from bilingual, word-aligned, syntactically parsed text, but current systems for parsing and word alignment are at best cascaded and at worst totally independent of one another. This work presents a unified joint model for simultaneous parsing and word alignment. To flexibly model syntactic divergence, we develop a discriminative log-linear model over two parse trees and an ITG derivation which is encouraged but not forced to synchronize with the parses. Our model gives absolute improvements of 3.3 F1 for English parsing, 2.1 F1 for Chinese parsing, and 5.5 F1 for word alignment over each task's independent baseline, giving the best reported results for both Chinese-English word alignment and joint parsing on the parallel portion of the Chinese treebank. We also show an improvement of 1.2 BLEU in downstream MT evaluation over basic HMM alignments.
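As a rough illustration of what a discriminative log-linear model over two parse trees and an ITG derivation looks like, the generic form can be written as below. The notation is assumed here for illustration and is not taken from the paper: s and s' are the two sentences, t and t' their parse trees, a the ITG derivation, φ a feature vector, and θ the learned weights.

```latex
% Generic log-linear form; all symbols are illustrative assumptions.
P_\theta(t, a, t' \mid s, s') =
  \frac{\exp\bigl(\theta^\top \phi(t, a, t', s, s')\bigr)}
       {\sum_{\bar{t},\, \bar{a},\, \bar{t}'} \exp\bigl(\theta^\top \phi(\bar{t}, \bar{a}, \bar{t}', s, s')\bigr)}
```

Under this reading, "encouraged but not forced to synchronize" would correspond to features in φ that reward agreement between the ITG derivation and the two parses, rather than hard constraints that rule out non-synchronous structures.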