Solving the problem of cascading errors: approximate Bayesian inference for linguistic annotation pipelines

Authors:
Jenny Rose Finkel; Christopher D. Manning; Andrew Y. Ng
Affiliations:
Stanford University, Stanford, CA; Stanford University, Stanford, CA; Stanford University, Stanford, CA
Venue:
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Year:
2006

Citing 18 (works this paper cites; listed first below)
Cited 26 (works that cite this paper; listed second below)

Foundations of statistical natural language processing

Foundations of statistical natural language processing
Probabilistic Networks and Expert Systems

Probabilistic Networks and Expert Systems
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Convergence rates of the Voting Gibbs classifier, with application to Bayesian feature selection

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Parsing inside-out

Parsing inside-out
A maximum-entropy-inspired parser

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
The problem of computing the most probable tree in data-oriented parsing and stochastic tree grammars

EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
An integrated, conditional model of information extraction and coreference with application to citation matching

UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Dynamic programming for parsing and estimation of stochastic unification-based grammars

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Accurate unlexicalized parsing

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Effective statistical models for syntactic and semantic disambiguation

Effective statistical models for syntactic and semantic disambiguation
Joint learning improves semantic role labeling

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning to recognize features of valid textual entailments

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Better k-best parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Introduction to the CoNLL-2005 shared task: semantic role labeling

CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
A joint model for semantic role labeling

CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Generalized inference with multiple semantic role labeling systems

CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Joint parsing and semantic role labeling

CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning

Piecewise pseudolikelihood for efficient training of conditional random fields

Proceedings of the 24th international conference on Machine learning
A global joint model for semantic role labeling

Computational Linguistics
Combining Bayesian Networks and Formal Reasoning for Semantic Classification of Student Utterances

Proceedings of the 2007 conference on Artificial Intelligence in Education: Building Technology Rich Learning Contexts That Work
Monte carlo inference and maximization for phrase-based translation

CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Learning with probabilistic features for improved pipeline models

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Bi-directional Joint Inference for Entity Resolution and Segmentation Using Imperatively-Defined Factor Graphs

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Active learning for pipeline models

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Joint parsing and named entity recognition

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Piecewise training for structured prediction

Machine Learning
A global model for joint lemmatization and part-of-speech prediction

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Automatic diacritization for low-resource languages using a hybrid word and consonant CMM

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Convolution kernel over packed parse forest

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Hierarchical sequential learning for extracting opinions and their attributes

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
A probabilistic morphological analyzer for Syriac

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Improving the quality of text understanding by delaying ambiguity resolution

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Probabilistic tree-edit models with structured latent variables for textual entailment and question answering

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Monte Carlo techniques for phrase-based translation

Machine Translation
Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
A comparison of loopy belief propagation and dual decomposition for integrated CCG supertagging and parsing

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Fine-grained class label markup of search queries

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Towards a top-down and bottom-up bidirectional approach to joint information extraction

Proceedings of the 20th ACM international conference on Information and knowledge management
Multilayer sequence labeling

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Exact sampling and decoding in high-order hidden Markov models

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Beyond myopic inference in big data pipelines

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Information extraction as a filtering task

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Joint inference of entities, relations, and coreference

Proceedings of the 2013 workshop on Automated knowledge base construction

Abstract

The end-to-end performance of natural language processing systems for compound tasks, such as question answering and textual entailment, is often hampered by the use of a greedy 1-best pipeline architecture, which causes errors to propagate and compound at each stage. We present a novel architecture, which models these pipelines as Bayesian networks, with each low-level task corresponding to a variable in the network, and then we perform approximate inference to find the best labeling. Our approach is extremely simple to apply but gains the benefits of sampling the entire distribution over labels at each stage in the pipeline. We apply our method to two tasks -- semantic role labeling and recognizing textual entailment -- and achieve useful performance gains from the superior pipeline architecture.
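
The contrast the abstract draws between a greedy 1-best pipeline and one that samples from each stage can be illustrated with a toy sketch. This is not the authors' implementation: the two stages, their labels, and all probabilities below are hypothetical values chosen only to show how a 1-best decision at an early stage can hide the answer that is most probable once the early stage is marginalized out. The names STAGE1, STAGE2, greedy_pipeline, and sampled_pipeline are invented for this illustration.

```python
import random
from collections import Counter

# Hypothetical toy distributions (assumptions for illustration, not from the paper).
# Stage 1: P(tag) for an ambiguous word, as an upstream annotator might output.
STAGE1 = {"NOUN": 0.55, "VERB": 0.45}
# Stage 2: P(label | tag), a downstream task conditioned on the stage-1 output.
STAGE2 = {
    "NOUN": {"A": 0.55, "B": 0.45},
    "VERB": {"A": 0.05, "B": 0.95},
}


def sample(dist, rng):
    """Draw one label from a {label: probability} dictionary."""
    labels = list(dist)
    weights = [dist[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def greedy_pipeline():
    """1-best pipeline: commit to the single best output at every stage."""
    tag = max(STAGE1, key=STAGE1.get)
    return max(STAGE2[tag], key=STAGE2[tag].get)


def sampled_pipeline(num_samples, rng):
    """Sample a complete labeling through the pipeline many times and
    return the final-stage label that occurs most often."""
    votes = Counter()
    for _ in range(num_samples):
        tag = sample(STAGE1, rng)          # sample stage 1
        label = sample(STAGE2[tag], rng)   # sample stage 2 given stage 1
        votes[label] += 1
    return votes.most_common(1)[0][0]


def exact_marginal():
    """Exact P(label), summing over stage-1 tags (feasible only for toy cases)."""
    marginal = Counter()
    for tag, p_tag in STAGE1.items():
        for label, p_label in STAGE2[tag].items():
            marginal[label] += p_tag * p_label
    return dict(marginal)


if __name__ == "__main__":
    rng = random.Random(0)
    print("greedy 1-best:", greedy_pipeline())
    print("sampled vote: ", sampled_pipeline(2000, rng))
    print("exact marginal:", exact_marginal())
```

With these made-up numbers the greedy pipeline commits to NOUN and therefore outputs A, while the marginal probability of B is higher once both tags are taken into account; the sampled vote approximates that marginal without enumerating every combination of stage outputs.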