Statistical decision-tree models for parsing

  • Authors: David M. Magerman

  • Affiliations: Bolt Beranek and Newman Inc., Cambridge, MA

  • Venue: ACL '95: Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics

  • Year: 1995

Abstract

Syntactic natural language parsers have shown themselves to be inadequate for processing highly ambiguous, large-vocabulary text, as evidenced by their poor performance on domains like the Wall Street Journal and by the movement away from parsing-based approaches to text processing in general. In this paper, I describe SPATTER, a statistical parser based on decision-tree learning techniques, which constructs a complete parse for every sentence and achieves accuracy rates far better than any published result. This work is based on the following premises: (1) grammars are too complex and detailed to develop manually for most interesting domains; (2) parsing models must rely heavily on lexical and contextual information to analyze sentences accurately; and (3) existing n-gram modeling techniques are inadequate for parsing models. In experiments comparing SPATTER with IBM's computer manuals parser, SPATTER significantly outperforms the grammar-based parser. Evaluated against the Penn Treebank Wall Street Journal corpus using the PARSEVAL measures, SPATTER achieves 86% precision, 86% recall, and 1.3 crossing brackets per sentence for sentences of 40 words or fewer, and 91% precision, 90% recall, and 0.5 crossing brackets for sentences between 10 and 20 words in length.
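For concreteness, here is a minimal sketch of the unlabeled PARSEVAL bracket measures cited above. The span representation and the `parseval` helper are illustrative assumptions, not the evaluation code used in the paper: each parse is reduced to a set of (start, end) constituent spans, precision and recall count exactly matching spans, and the crossing-bracket score counts predicted spans that overlap a gold span without either containing the other.

```python
def parseval(gold_spans, pred_spans):
    """Unlabeled PARSEVAL for one sentence (illustrative sketch, not the
    paper's evaluation code).

    Spans are (start, end) pairs over token positions, end exclusive.
    Corpus-level figures like those in the abstract aggregate these
    per-sentence counts.
    """
    gold, pred = set(gold_spans), set(pred_spans)
    matched = len(gold & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    # A predicted span "crosses" a gold span when the two overlap
    # but neither is nested inside the other.
    crossing = sum(
        1
        for ps, pe in pred
        if any(ps < gs < pe < ge or gs < ps < ge < pe for gs, ge in gold)
    )
    return precision, recall, crossing

# Toy example: one exact match, one crossing prediction.
gold = [(0, 5), (0, 2), (2, 5)]
pred = [(0, 5), (0, 3), (3, 5)]
print(parseval(gold, pred))  # (0.3333..., 0.3333..., 1)
```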