Alternative approaches for generating bodies of grammar rules

Authors: Gabriel Infante-Lopez; Maarten de Rijke
Affiliations: University of Amsterdam; University of Amsterdam
Venue: ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Year: 2004



Abstract

We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers, rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular languages. We compare two approaches for inducing such languages: one based on n-grams, the other on minimization of the Kullback-Leibler divergence. The inferred regular languages are used for generating bodies of rules inside a parsing procedure. We compare the two approaches along two dimensions: the quality of the probabilistic regular language each produces, and the performance of the parser built with it. The second approach outperforms the first along both dimensions.
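To make the two ingredients named in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: it only shows how a bigram model can define a probabilistic regular language over rule bodies, and how the Kullback-Leibler divergence from the empirical distribution of observed bodies to that model can be measured. The tag sequences in `bodies`, the boundary markers `START` and `END`, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers, not from the paper


def train_bigram(sequences):
    """Maximum-likelihood bigram model: P(next | prev) from raw counts."""
    counts = defaultdict(Counter)
    for seq in sequences:
        symbols = [START, *seq, END]
        for prev, nxt in zip(symbols, symbols[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for prev, nexts in counts.items()
    }


def sequence_prob(model, seq):
    """Probability the bigram model assigns to one complete rule body."""
    symbols = [START, *seq, END]
    p = 1.0
    for prev, nxt in zip(symbols, symbols[1:]):
        p *= model.get(prev, {}).get(nxt, 0.0)
    return p


def kl_divergence(sequences, model):
    """D(P_empirical || Q_model) in bits, summed over the observed bodies.

    Assumes the model gives every observed body non-zero probability,
    which holds when the model was trained on the same sequences.
    """
    empirical = Counter(tuple(s) for s in sequences)
    total = sum(empirical.values())
    kl = 0.0
    for seq, count in empirical.items():
        p = count / total
        q = sequence_prob(model, seq)
        kl += p * math.log2(p / q)
    return kl


# Toy rule bodies (sequences of part-of-speech tags); invented for illustration.
bodies = [["DT", "NN"], ["DT", "JJ", "NN"], ["DT", "JJ", "JJ", "NN"], ["NN"]]
model = train_bigram(bodies)
print(sequence_prob(model, ["DT", "JJ", "NN"]))
print(kl_divergence(bodies, model))
```

In this toy setting the divergence is positive because the bigram model also assigns probability to bodies that never occur in the sample, such as longer runs of `JJ`. An induction method that minimizes the divergence would search over candidate automata for one that lowers this quantity; that search is not shown here.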