In this article, we apply to natural language parsing and tagging the device of trigger-pair predictors, previously employed exclusively within the field of language modelling for speech recognition. Given the task of predicting the correct rule to associate with a parse-tree node, or the correct tag to associate with a word of text, and assuming a particular class of parsing or tagging model, we quantify the information gain realized by taking account of rule or tag trigger-pair predictors, i.e., pairs consisting of a "triggering" rule or tag that has already occurred in the document being processed, together with a specific "triggered" rule or tag whose probability of occurrence within the current sentence we wish to estimate. This information gain is shown to be substantial. Further, by utilizing trigger pairs taken from the same general sort of document as is being processed (e.g., same subject matter or same discourse type), as opposed to predictors derived from a comprehensive general set of English texts, we can significantly increase this information gain.
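To make the notion of information gain for a single trigger pair concrete, the sketch below computes the mutual information, in bits, between two binary events: "the triggering tag or rule has already occurred in the document" and "the triggered tag or rule occurs in the current sentence". This is only an illustration under stated assumptions. The abstract does not give the exact measure or the conditioning model used in the article, and the function name `trigger_pair_mi` and its four count parameters are hypothetical.

```python
import math


def trigger_pair_mi(n_both, n_trigger_only, n_triggered_only, n_neither):
    """Mutual information (bits) between two binary events.

    Hypothetical inputs, counted over sentences:
      n_both            trigger seen earlier AND triggered item in sentence
      n_trigger_only    trigger seen earlier, triggered item absent
      n_triggered_only  trigger not seen, triggered item present
      n_neither         neither event holds
    Assumption: plain mutual information stands in for the article's
    information-gain measure, which the abstract does not specify.
    """
    total = n_both + n_trigger_only + n_triggered_only + n_neither
    if total == 0:
        raise ValueError("at least one observation is required")

    # Joint distribution over (trigger seen?, triggered item present?).
    joint = {
        (1, 1): n_both / total,
        (1, 0): n_trigger_only / total,
        (0, 1): n_triggered_only / total,
        (0, 0): n_neither / total,
    }
    # Marginal distributions of each binary event.
    p_trigger = {1: joint[(1, 1)] + joint[(1, 0)],
                 0: joint[(0, 1)] + joint[(0, 0)]}
    p_triggered = {1: joint[(1, 1)] + joint[(0, 1)],
                   0: joint[(1, 0)] + joint[(0, 0)]}

    mi = 0.0
    for (a, b), p_ab in joint.items():
        if p_ab > 0:  # zero-probability cells contribute nothing
            mi += p_ab * math.log2(p_ab / (p_trigger[a] * p_triggered[b]))
    return mi


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(trigger_pair_mi(40, 10, 10, 40))
```

Under this sketch, a pair whose two events are statistically independent scores zero, and a pair in which the trigger perfectly predicts the triggered item (with both outcomes equally likely) scores one bit. Comparing scores for pairs estimated from same-domain documents against pairs estimated from a general corpus would mirror the comparison the abstract describes, though the article's own procedure may differ.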