In this paper we describe a new technique for parsing free text: a transformational grammar is automatically learned that can accurately parse text into binary-branching syntactic trees with unlabelled nonterminals. The algorithm begins in a very naive state of knowledge about phrase structure. By repeatedly comparing the bracketing produced in its current state with the correct bracketing provided in the training corpus, the system learns a set of simple structural transformations that can be applied to reduce error. After describing the algorithm, we present results and compare them with other recent results in automatic grammar induction.
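
The error-driven learning loop described above can be illustrated with a minimal sketch. The abstract does not specify the naive starting state or the form of the transformations, so everything concrete here is an assumption made for illustration: the start state is taken to be a right-branching tree, the single transformation template is a hypothetical left rotation triggered by a part-of-speech tag, and the names `right_branching`, `rotate_left_at`, `score`, and `learn` are invented for this example. It is not the transformation set used in the paper.

```python
from typing import List, Set, Tuple, Union

# A tree is either a tag (leaf) or a pair of subtrees (unlabelled binary node).
Tree = Union[str, Tuple["Tree", "Tree"]]
Span = Tuple[int, int]


def right_branching(tags: List[str]) -> Tree:
    """Assumed naive start state: (t1 (t2 (t3 ...)))."""
    tree: Tree = tags[-1]
    for tag in reversed(tags[:-1]):
        tree = (tag, tree)
    return tree


def spans(tree: Tree, start: int = 0) -> Tuple[Set[Span], int]:
    """Return the (start, end) spans of all internal nodes and the end index."""
    if isinstance(tree, str):
        return set(), start + 1
    left_spans, mid = spans(tree[0], start)
    right_spans, end = spans(tree[1], mid)
    return left_spans | right_spans | {(start, end)}, end


def rotate_left_at(tree: Tree, tag: str) -> Tree:
    """Hypothetical transformation: rewrite (tag (M R)) as ((tag M) R)."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == tag and isinstance(right, tuple):
        left, right = (left, right[0]), right[1]
    return (rotate_left_at(left, tag), rotate_left_at(right, tag))


def score(trees: List[Tree], gold: List[Set[Span]]) -> int:
    """Count predicted brackets that also appear in the gold bracketing."""
    return sum(len(spans(t)[0] & g) for t, g in zip(trees, gold))


def learn(sents: List[List[str]], gold: List[Set[Span]], max_rules: int = 10):
    """Greedily pick the transformation that most reduces bracketing error."""
    trees = [right_branching(s) for s in sents]
    tags = sorted({t for s in sents for t in s})
    rules: List[str] = []
    for _ in range(max_rules):
        base = score(trees, gold)
        best_tag, best_gain, best_trees = None, 0, trees
        for tag in tags:
            candidate = [rotate_left_at(t, tag) for t in trees]
            gain = score(candidate, gold) - base
            if gain > best_gain:
                best_tag, best_gain, best_trees = tag, gain, candidate
        if best_tag is None:  # no transformation reduces error any further
            break
        rules.append(best_tag)
        trees = best_trees
    return rules, trees


# Toy training corpus (made up): one tagged sentence and its gold brackets,
# corresponding to ((DT NN) (VB (DT NN))).
toy_sents = [["DT", "NN", "VB", "DT", "NN"]]
toy_gold = [{(0, 2), (3, 5), (2, 5), (0, 5)}]
learned_rules, learned_trees = learn(toy_sents, toy_gold)
print(learned_rules, learned_trees[0])
```

On this toy input the loop keeps a single rule, the rotation triggered by `DT`, because it is the only candidate that increases the number of brackets shared with the gold tree. The learned rules form an ordered list, so at parsing time they would be applied to a fresh right-branching tree in the order in which they were learned.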