Probabilistic reasoning in intelligent systems: networks of plausible inference
Empirical methods for artificial intelligence
A maximum entropy approach to natural language processing
Computational Linguistics
Inducing Features of Random Fields
IEEE Transactions on Pattern Analysis and Machine Intelligence
Statistical methods for speech recognition
Introduction To Automata Theory, Languages, And Computation
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Maximum Entropy Markov Models for Information Extraction and Segmentation
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Generalized probabilistic LR parsing of natural language (corpora) with unification-based grammars
Computational Linguistics - Special issue on using large corpora: I
PCFG models of linguistic tree representations
Computational Linguistics
Estimators for stochastic "Unification-Based" grammars
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Relating probabilistic grammars and automata
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Lessons from a failure: generating tailored smoking cessation letters
Artificial Intelligence
Deep syntactic processing by combining shallow methods
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Conditional structure versus conditional estimation in NLP models
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
PCFG parsing for restricted classical Chinese texts
SIGHAN '02 Proceedings of the first SIGHAN workshop on Chinese language processing - Volume 18
Discriminative training of a neural network statistical parser
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Contrastive estimation: training log-linear models on unlabeled data
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Competitive generative models with structure learning for NLP classification tasks
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Sparse multi-scale grammars for discriminative latent variable parsing
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Efficient extraction of grammatical relations
Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Integrating Generative and Discriminative Character-Based Models for Chinese Word Segmentation
ACM Transactions on Asian Language Information Processing (TALIP)
Higher-order constituent parsing and parser combination
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
An information-theoretic measure to evaluate parsing difficulty across treebanks
ACM Transactions on Speech and Language Processing (TSLP)
This paper compares two different ways of estimating statistical language models. Many statistical NLP tagging and parsing models are estimated by maximizing the (joint) likelihood of the fully observed training data. However, since these applications only require conditional probability distributions, those distributions can in principle be learned by maximizing the conditional likelihood of the training data instead. Perhaps surprisingly, models estimated by maximizing the joint likelihood were superior to models estimated by maximizing the conditional likelihood, even though some of the latter models intuitively had access to "more information".
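The two estimation objectives contrasted in the abstract can be illustrated on a toy classification task. The sketch below (hypothetical data and names, not from the paper) fits a naive Bayes model by relative-frequency counts, which maximizes the joint likelihood P(x, y), and a logistic regression over the same features by gradient ascent on the conditional log-likelihood sum of log P(y|x). On the training data the conditionally trained model necessarily scores at least as well on the conditional objective, since it optimizes it directly; the paper's surprising finding concerns held-out performance, which this sketch does not measure.

```python
import math
from collections import Counter, defaultdict

# Toy binary-feature data: (features, label). Hypothetical illustration data.
DATA = [((1, 1), 1), ((1, 0), 1), ((1, 1), 1), ((0, 0), 0),
        ((0, 1), 0), ((0, 0), 0), ((1, 0), 0), ((0, 1), 1)]

def train_joint(data):
    """Naive Bayes: relative-frequency (count-based) estimates maximize
    the JOINT likelihood P(x, y) of the fully observed training data."""
    label_counts = Counter(y for _, y in data)
    feat_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    for x, y in data:
        for i, v in enumerate(x):
            feat_counts[(i, y)][v] += 1

    def p_y_given_x(x):
        scores = {}
        for y, cy in label_counts.items():
            s = math.log(cy / len(data))
            for i, v in enumerate(x):
                c = feat_counts[(i, y)]
                # add-one smoothing over the two binary feature values
                s += math.log((c[v] + 1) / (sum(c.values()) + 2))
            scores[y] = s
        z = math.log(sum(math.exp(s) for s in scores.values()))
        return {y: math.exp(s - z) for y, s in scores.items()}
    return p_y_given_x

def train_conditional(data, epochs=200, lr=0.5):
    """Logistic regression over the same features: stochastic gradient
    ascent on the CONDITIONAL log-likelihood sum_i log P(y_i | x_i)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            p1 = 1 / (1 + math.exp(-(b + sum(wi * xi for wi, xi in zip(w, x)))))
            err = y - p1  # gradient of the log-likelihood w.r.t. the logit
            b += lr * err
            for i, xi in enumerate(x):
                w[i] += lr * err * xi

    def p_y_given_x(x):
        p1 = 1 / (1 + math.exp(-(b + sum(wi * xi for wi, xi in zip(w, x)))))
        return {0: 1 - p1, 1: p1}
    return p_y_given_x

def cond_log_lik(model, data):
    return sum(math.log(model(x)[y]) for x, y in data)

joint_model = train_joint(DATA)
cond_model = train_conditional(DATA)
print("joint-trained  conditional log-lik:", round(cond_log_lik(joint_model, DATA), 3))
print("cond-trained   conditional log-lik:", round(cond_log_lik(cond_model, DATA), 3))
```

Both models produce the same kind of object — a conditional distribution P(y|x) — which is why the abstract can compare them head to head: the difference lies only in which likelihood was maximized during estimation.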