Discriminative Reranking for Natural Language Parsing

Authors:
Michael Collins;Terry Koo
Affiliations:
-;-
Venue:
Computational Linguistics
Year:
2005

Citing 38
Cited 62

A theory of the learnable

Communications of the ACM
Probabilistic reasoning in intelligent systems: networks of plausible inference

Probabilistic reasoning in intelligent systems: networks of plausible inference
Robust trainability of single neurons

Journal of Computer and System Sciences
A maximum entropy approach to natural language processing

Computational Linguistics
Inducing Features of Random Fields

IEEE Transactions on Pattern Analysis and Machine Intelligence
A decision-theoretic generalization of on-line learning and an application to boosting

Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Additive models, boosting, and inference for generalized divergences

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Large Margin Classification Using the Perceptron Algorithm

Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Improved Boosting Algorithms Using Confidence-rated Predictions

Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Direct optimization of margins improves generalization in combined classifiers

Proceedings of the 1998 conference on Advances in neural information processing systems II
BoosTexter: A Boosting-based System for Text Categorization

Machine Learning - Special issue on information retrieval
Logistic Regression, AdaBoost and Bregman Distances

Machine Learning
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Discriminative Reranking for Natural Language Parsing

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
An Efficient Boosting Algorithm for Combining Preferences

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Maximum entropy models for natural language ambiguity resolution

Maximum entropy models for natural language ambiguity resolution
Head-driven statistical models for natural language parsing

Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the Penn Treebank

Computational Linguistics - Special issue on using large corpora: II
Stochastic attribute-value grammars

Computational Linguistics
A maximum-entropy-inspired parser

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Three generative, lexicalised models for statistical parsing

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Estimators for stochastic "Unification-Based" grammars

ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
New ranking algorithms for parsing and tagging: kernels over discrete structures, and the voted perceptron

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Discriminative training and maximum entropy models for statistical machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Ranking algorithms for named-entity extraction: boosting and the voted perceptron

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
SPoT: a trainable sentence planner

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Inducing history representations for broad coverage statistical parsing

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Shallow parsing with conditional random fields

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Towards history-based grammars: using richer models for probabilistic parsing

HLT '91 Proceedings of the workshop on Speech and Natural Language
Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A comparison of algorithms for maximum entropy parameter estimation

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Using LTAG based features in parse reranking

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Investigating loss functions and optimization methods for discriminative learning of label sequences

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
A fast algorithm for feature selection in conditional maximum entropy modeling

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Parameter estimation for statistical parsing models: theory and practice of distribution-free methods

New developments in parsing technology
Statistical parsing with a context-free grammar and word statistics

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Efficiently inducing features of conditional random fields

UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence

Contextual search and name disambiguation in email using graphs

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Confidence estimation for NLP applications

ACM Transactions on Speech and Language Processing (TSLP)
Approximation lasso methods for language modeling

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Japanese dependency parsing using co-occurrence information and a combination of case elements

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Morphology and reranking for the statistical parsing of Spanish

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Discriminative classifiers for deterministic dependency parsing

COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
Improving Speech Recognition and Understanding using Error-Corrective Reranking

ACM Transactions on Asian Language Information Processing (TALIP)
Learning to rank typed graph walks: local and global approaches

Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis
Relaxation Labeling for Selecting and Exploiting Efficiently Non-local Dependencies in Sequence Labeling

PKDD 2007 Proceedings of the 11th European conference on Principles and Practice of Knowledge Discovery in Databases
A Joint Segmenting and Labeling Approach for Chinese Lexical Analysis

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Beyond log-linear models: boosted minimum error rate training for N-best Re-ranking

HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
A lattice-based framework for enhancing statistical parsers with information from unlabeled corpora

CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
A fast boosting-based learner for feature-rich tagging and chunking

CoNLL '08 Proceedings of the Twelfth Conference on Computational Natural Language Learning
When is self-training effective for parsing?

COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Perceptron training for a wide-coverage lexicalized-grammar parser

DeepLP '07 Proceedings of the Workshop on Deep Linguistic Processing
Predicting the fluency of text with shallow structural features: case studies of machine translation and human-written text

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Parsing coordinations

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Loss minimization in parse reranking

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
A generative model for parsing natural language to meaning representations

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Learning graph walk based similarity measures for parsed text

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Combining constituent parsers

NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Making grammar-based generation easier to deploy in dialogue systems

SIGdial '08 Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue
Database-text alignment via structured multilabel classification

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
How the statistical revolution changes (computational) linguistics

ILCL '09 Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?
Constituent parsing by classification

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
A graphical framework for contextual search and name disambiguation in email

TextGraphs-1 Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing
A ranking approach to stress prediction for letter-to-phoneme conversion

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Semantic tagging of web search queries

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
Practical grammar-based NLG from examples

INLG '08 Proceedings of the Fifth International Natural Language Generation Conference
Reranking the Berkeley and Brown parsers

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
An extractive supervised two-stage method for sentence compression

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Faster parsing by supertagger adaptation

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
N-best reranking by multitask learning

WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR
Utilizing extra-sentential context for parsing

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Using web-scale N-grams to improve base NP parsing performance

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Improving graph-walk-based similarity with reranking: Case studies for personal information management

ACM Transactions on Information Systems (TOIS)
Structural features for predicting the linguistic quality of text: applications to machine translation, automatic summarization and human-authored text

Empirical methods in natural language generation
From layout to semantic: a reranking model for mapping web documents to mediated XML representations

Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
Web-scale features for full-scale parsing

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Ordering prenominal modifiers with a reranking approach

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Parsing the internal structure of words: a new paradigm for Chinese word segmentation

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Joint reranking of parsing and word recognition with automatic segmentation

Computer Speech and Language
Adding smarter systems instead of human annotators: re-ranking for system combination

Proceedings of the 1st international workshop on Search and mining entity-relationship data
Regularized least-squares for parse ranking

IDA'05 Proceedings of the 6th international conference on Advances in Intelligent Data Analysis
A generate and rank approach to sentence paraphrasing

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hypotheses selection criteria in a reranking framework for spoken language understanding

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Parse correction with specialized models for difficult attachment types

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Syntactic language modeling with formal grammars

Speech Communication
Efficient training of discriminative language models by sample selection

Speech Communication
Empirical comparisons of various discriminative language models for speech recognition

ROCLING '11 Proceedings of the 23rd Conference on Computational Linguistics and Speech Processing
A history-based matching approach to identification of framework evolution

Proceedings of the 34th International Conference on Software Engineering
Tree representations in probabilistic models for extended named entities detection

EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
Optimized online rank learning for machine translation

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Low-dimensional discriminative reranking

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Phrase-based approach for adaptive tokenization

SIGMORPHON '12 Proceedings of the Twelfth Meeting of the Special Interest Group on Computational Morphology and Phonology
Spectral dependency parsing with latent variables

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
A coherence model based on syntactic patterns

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
A reranking model for discourse segmentation using subtree features

SIGDIAL '12 Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Knowledge sources for constituent parsing of German, a morphologically rich and less-configurational language

Computational Linguistics
Word segmentation, unknown-word resolution, and morphological agreement in a Hebrew parsing system

Computational Linguistics
Exploiting discourse information to identify paraphrases

Expert Systems with Applications: An International Journal
Generation of compound words in statistical machine translation into compounding languages

Computational Linguistics

Abstract

This article considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account. We introduce a new method for the reranking task, based on the boosting approach to ranking problems described in Freund et al. (1998). We apply the boosting method to parsing the Wall Street Journal treebank. The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model. The new model achieved 89.75% F-measure, a 13% relative decrease in F-measure error over the baseline model's score of 88.2%. The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach. We argue that the method is an appealing alternative, in terms of both simplicity and efficiency, to work on feature selection methods within log-linear (maximum-entropy) models. Although the experiments in this article are on natural language parsing, the approach should be applicable to many other natural language processing (NLP) problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.
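
The reranking setup described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes each candidate parse is a pair of a base-parser log-probability and a set of binary feature names, that the first candidate in each list is the best available parse, and that training minimizes an exponential loss over pairs of best and competing candidates. The function names, the fixed smoothing constant `eps`, and the toy data are all hypothetical choices made for illustration. The last function simply checks the relative error reduction quoted in the abstract.

```python
import math
from collections import defaultdict

def score(cand, w0, weights):
    """Base log-probability scaled by w0, plus weights of active binary features."""
    logprob, feats = cand
    return w0 * logprob + sum(weights.get(k, 0.0) for k in feats)

def rerank(cands, w0, weights):
    """Index of the highest-scoring candidate."""
    return max(range(len(cands)), key=lambda j: score(cands[j], w0, weights))

def exp_loss(data, w0, weights):
    """Sum of exp(-margin) over (best, competitor) pairs; cands[0] is the best parse."""
    return sum(
        math.exp(-(score(cands[0], w0, weights) - score(other, w0, weights)))
        for cands in data
        for other in cands[1:]
    )

def boost_round(data, w0, weights, eps=1e-3):
    """One greedy round: pick one feature and update its weight in place.

    Only features that differ between the best parse and a competitor are
    visited, which is where feature sparsity helps. eps is an assumed
    smoothing constant that keeps the update finite.
    """
    pos = defaultdict(float)  # exp(-margin) mass where the feature favors the best parse
    neg = defaultdict(float)  # exp(-margin) mass where it favors a competitor
    for cands in data:
        best = cands[0]
        for other in cands[1:]:
            e = math.exp(-(score(best, w0, weights) - score(other, w0, weights)))
            for k in best[1] - other[1]:
                pos[k] += e
            for k in other[1] - best[1]:
                neg[k] += e
    keys = sorted(set(pos) | set(neg))
    if not keys:
        return None
    # Choose the feature with the largest estimated loss reduction.
    k = max(keys, key=lambda f: abs(math.sqrt(pos[f]) - math.sqrt(neg[f])))
    weights[k] = weights.get(k, 0.0) + 0.5 * math.log((pos[k] + eps) / (neg[k] + eps))
    return k

def relative_error_reduction(baseline_f, new_f):
    """Fraction of the baseline F-measure error (100 - F) that is removed."""
    return (new_f - baseline_f) / (100.0 - baseline_f)

# Toy data (hypothetical): in both sentences the base parser prefers candidate 1.
data = [
    [(-1.0, {"a"}), (-0.5, {"b"})],
    [(-2.0, {"a", "c"}), (-1.5, {"b", "c"})],
]
weights = {}
before = exp_loss(data, 1.0, weights)
chosen = boost_round(data, 1.0, weights)
after = exp_loss(data, 1.0, weights)
```

With the abstract's figures, `relative_error_reduction(88.2, 89.75)` evaluates to 1.55 / 11.8, or roughly 13%, which matches the stated relative decrease in F-measure error.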