The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modelling of hierarchical structure than the original one. In particular, for a grammar trained on a corpus of 700 hand-parsed part-of-speech strings for ATIS sentences, over 90% of the constituents in the most likely analyses of a test set are compatible with the constituents of the test set bracketing.
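
The abstract does not define what makes a constituent "compatible" with a bracketing. A minimal sketch, assuming the usual non-crossing notion (a span is compatible if it does not partially overlap any bracket), might look like the following. The names `crosses`, `is_compatible`, and `compatible_fraction` are hypothetical and chosen for illustration; spans are assumed to be half-open `(start, end)` pairs over word positions.

```python
from typing import Iterable, Tuple

# Assumption: a constituent is a half-open (start, end) span over word positions.
Span = Tuple[int, int]


def crosses(a: Span, b: Span) -> bool:
    """True if the two spans overlap without one containing the other."""
    (i, j), (k, l) = a, b
    return (i < k < j < l) or (k < i < l < j)


def is_compatible(span: Span, bracketing: Iterable[Span]) -> bool:
    """A span is compatible with a bracketing if it crosses none of its brackets."""
    return not any(crosses(span, b) for b in bracketing)


def compatible_fraction(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """Fraction of predicted constituents that cross no gold bracket."""
    predicted = list(predicted)
    gold = list(gold)
    if not predicted:
        return 1.0  # nothing predicted, so nothing crosses
    ok = sum(is_compatible(s, gold) for s in predicted)
    return ok / len(predicted)


# Hypothetical five-word sentence: gold brackets versus a parser's constituents.
gold = [(0, 5), (0, 2), (2, 5), (3, 5)]
predicted = [(0, 5), (1, 3), (3, 5), (0, 2)]
score = compatible_fraction(predicted, gold)  # (1, 3) crosses (0, 2) and (2, 5)
```

Under the same assumption, this check could also serve during training: spans that cross a bracket of the partially parsed sentence would be excluded from the inside and outside computations, which is one plausible reading of how the constituent information is exploited.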