A robust and hybrid deep-linguistic theory applied to large-scale parsing

Authors:
Gerold Schneider;James Dowdall;Fabio Rinaldi
Affiliations:
University of Zurich;University of Zurich;University of Zurich
Venue:
ROMAND '04 Proceedings of the 3rd Workshop on RObust Methods in Analysis of Natural Language Data
Year:
2004

Abstract

Modern statistical parsers are robust and quite fast, but their output is relatively shallow compared to that of formal grammar parsers. We propose extending statistical approaches to a more deep-linguistic analysis while keeping the speed and low complexity of a statistical parser. The parsing architecture proposed, implemented, and evaluated here is highly robust and hybrid on several levels: it combines statistical and rule-based approaches, constituency and dependency grammar, shallow and deep processing, and full and near-full parsing. With a parsing speed of about 300,000 words per hour and state-of-the-art performance, the parser is reliable for a number of large-scale applications discussed in the article.