Competitive generative models with structure learning for NLP classification tasks

Authors:
Kristina Toutanova
Affiliations:
Microsoft Research, Redmond, WA
Venue:
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Year:
2006

Abstract

In this paper we show that generative models are competitive with, and sometimes superior to, discriminative models when both kinds of models are allowed to learn structures that are optimal for discrimination. In particular, we compare Bayesian networks and conditional log-linear models on two NLP tasks. We observe that when the structure of the generative model encodes very strong independence assumptions (à la Naive Bayes), a discriminative model is superior, but when the generative model is allowed to weaken these independence assumptions by learning a more complex structure, it can achieve performance very similar to or better than that of a corresponding discriminative model. In addition, because structure learning is far more efficient for generative models, they may be preferable for some tasks.
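
As a rough illustration of the contrast the abstract draws, the three model families can be written as follows. The notation is an assumption made for this sketch and is not taken from the paper: y is the class label, x_1, ..., x_n are the features, Pa(x_i) stands for any feature parents of x_i other than y that structure learning adds, and f_j and \lambda_j are the feature functions and weights of the log-linear model. The paper's exact model classes may differ.

```latex
\begin{align*}
\text{Naive Bayes:} \quad
  & P(y, x_1, \dots, x_n) = P(y) \prod_{i=1}^{n} P(x_i \mid y) \\
\text{Bayesian network, learned structure:} \quad
  & P(y, x_1, \dots, x_n) = P(y) \prod_{i=1}^{n} P\bigl(x_i \mid y, \mathrm{Pa}(x_i)\bigr) \\
\text{Conditional log-linear model:} \quad
  & P(y \mid x) = \frac{\exp\bigl(\sum_j \lambda_j f_j(x, y)\bigr)}
                       {\sum_{y'} \exp\bigl(\sum_j \lambda_j f_j(x, y')\bigr)}
\end{align*}
```

In this sketch, Naive Bayes is the special case in which every Pa(x_i) is empty. Both generative models classify by choosing the y that maximizes the joint probability, whereas the log-linear model is trained directly on the conditional distribution.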