This paper addresses the estimation of stochastic context-free grammars (SCFGs) and their use as language models. Classical estimation algorithms, together with new ones that restrict the estimation process to a certain subset of derivations, are presented in a unified framework. This subset of derivations is chosen according to both structural and statistical criteria. The estimated SCFGs are used in a new hybrid language model that combines a word-based n-gram, which captures the local relations between words, with a category-based SCFG and a distribution of words over categories, which together represent the long-term relations between those categories. We describe methods for learning these stochastic models on complex tasks, and we present an algorithm for computing the word-transition probability under the hybrid language model. Finally, experiments on the UPenn Treebank corpus show significant improvements in test-set perplexity relative to classical word-trigram models.
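The hybrid model described above can be sketched as a linear interpolation of two terms: a word-based n-gram probability and a category-based term that marginalizes over categories, weighting each category's word distribution by the SCFG's prediction of that category given the history. The sketch below is a minimal illustration under toy assumptions; in particular, `p_cat_given_hist` is a hypothetical stand-in table, whereas the paper computes the category prediction with SCFG prefix-probability algorithms, and all names and probability values here are invented for illustration.

```python
def hybrid_word_prob(word, history, p_ngram, p_word_given_cat,
                     p_cat_given_hist, alpha=0.6):
    """Interpolated word-transition probability:
    P(w|h) = alpha * P_ngram(w|h)
           + (1 - alpha) * sum_c P(w|c) * P_scfg(c|h)
    """
    # Local term: word-based n-gram probability for this history.
    ngram_term = p_ngram.get((history, word), 0.0)
    # Long-term term: marginalize over categories; P_scfg(c|h) would come
    # from the category-based SCFG (here a toy lookup table).
    scfg_term = sum(
        p_w_c.get(word, 0.0) * p_cat_given_hist.get(cat, 0.0)
        for cat, p_w_c in p_word_given_cat.items()
    )
    return alpha * ngram_term + (1.0 - alpha) * scfg_term

# Toy distributions (hypothetical values, bigram history of one word):
p_ngram = {(("the",), "cat"): 0.2}
p_word_given_cat = {"NOUN": {"cat": 0.5}, "VERB": {"cat": 0.0}}
p_cat_given_hist = {"NOUN": 0.8, "VERB": 0.2}

p = hybrid_word_prob("cat", ("the",), p_ngram, p_word_given_cat,
                     p_cat_given_hist)
# 0.6 * 0.2 + 0.4 * (0.5 * 0.8) = 0.28
```

The interpolation weight `alpha` balances the local n-gram evidence against the long-range category structure; in practice it would be tuned on held-out data rather than fixed.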