Head-Driven Statistical Models for Natural Language Parsing

Authors:
Michael Collins
Venue:
Computational Linguistics
Year:
2003

Abstract

This article describes three statistical models for natural language parsing. The models extend methods from probabilistic context-free grammars to lexicalized grammars, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree. Independence assumptions then lead to parameters that encode the X-bar schema, subcategorization, ordering of complements, placement of adjuncts, bigram lexical dependencies, wh-movement, and preferences for close attachment. All of these preferences are expressed by probabilities conditioned on lexical heads. The models are evaluated on the Penn Wall Street Journal Treebank, showing that their accuracy is competitive with other models in the literature. To gain a better understanding of the models, we also give results on different constituent types, as well as a breakdown of precision/recall results in recovering various types of dependencies. We analyze various characteristics of the models through experiments on parsing accuracy, by collecting frequencies of various structures in the treebank, and through linguistically motivated examples. Finally, we compare the models to others that have been applied to parsing the treebank, aiming to give some explanation of the difference in performance of the various models.
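
As a rough illustration of what "a sequence of decisions corresponding to a head-centered, top-down derivation" can look like, the sketch below scores one lexicalized expansion by first choosing the head child and then generating modifiers outward on each side until a stop symbol. This is a simplified reading of the abstract, not the article's exact parameterization: the function name `rule_log_prob`, the table layout, the `STOP` symbol, and every probability value are illustrative assumptions.

```python
import math

STOP = "STOP"  # hypothetical end-of-modifiers symbol for each side of the head


def rule_log_prob(parent, head_word, head_child, left_mods, right_mods,
                  p_head, p_mod):
    """Log-probability of expanding `parent` (headed by `head_word`) head-outward.

    `left_mods` and `right_mods` list modifiers from the head outward.
    `p_head` and `p_mod` are plain dicts of toy probabilities (assumed layout).
    """
    # Decision 1: choose the head child given the parent label and head word.
    logp = math.log(p_head[(head_child, parent, head_word)])
    # Decisions 2..n: generate each side's modifiers, then STOP, all
    # conditioned on the parent, the head child, and the lexical head.
    for side, mods in (("L", left_mods), ("R", right_mods)):
        for mod in list(mods) + [STOP]:
            logp += math.log(p_mod[(mod, side, parent, head_child, head_word)])
    return logp


# Toy tables for S(bought) -> NP(IBM) VP(bought); all values are made up.
p_head = {("VP", "S", "bought"): 0.5}
p_mod = {
    ("NP(IBM)", "L", "S", "VP", "bought"): 0.2,
    (STOP, "L", "S", "VP", "bought"): 0.5,
    (STOP, "R", "S", "VP", "bought"): 0.8,
}

log_p = rule_log_prob("S", "bought", "VP", ["NP(IBM)"], [], p_head, p_mod)
# The rule probability is the product of the four toy decisions:
# 0.5 * 0.2 * 0.5 * 0.8
print(math.exp(log_p))
```

Because every factor is conditioned on the head word, changing the lexical head changes the score of an otherwise identical expansion, which is the sense in which such preferences are "conditioned on lexical heads". A fuller model would add further conditioning (for example, subcategorization or distance), which this sketch deliberately omits.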