We consider some of our recent work on Good-Turing estimators in the larger context of learning theory and language modeling. The Good-Turing estimators have played a significant role in natural language modeling for the past twenty years. We have recently shown that these particular leave-one-out estimators converge rapidly. We present these results and consider their possible consequences for language modeling in general. In particular, other leave-one-out estimators, such as estimators of the cross-entropy of various forms of language models, might also be shown to converge rapidly using proof methods similar to those used for the Good-Turing estimators. This could have broad ramifications for the analysis and development of language modeling methods. We suggest that, in language modeling at least, leave-one-out estimation may be more significant than Occam's razor.
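To make the two estimators discussed above concrete, the following is a minimal sketch, not code from the paper: the classical Good-Turing estimate of the missing mass (the total probability of unseen words), given by the fraction of the sample consisting of words seen exactly once, and a leave-one-out estimate of the cross-entropy of a language model. The choice of an add-alpha-smoothed unigram model, and all function and parameter names, are illustrative assumptions; the paper's results concern such estimators in general, not this particular model.

```python
from collections import Counter
from math import log2

def good_turing_missing_mass(sample):
    """Good-Turing estimate of the missing mass: the total probability
    of words not seen in the sample, estimated as N1/n, where N1 is the
    number of word types occurring exactly once."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)  # singleton types
    return n1 / len(sample)

def leave_one_out_cross_entropy(sample, alpha=1.0, vocab_size=None):
    """Leave-one-out estimate of the cross-entropy (bits per word) of an
    add-alpha-smoothed unigram model: each token is scored by the model
    trained on the remaining n-1 tokens, and the negative log
    probabilities are averaged."""
    counts = Counter(sample)
    n = len(sample)
    v = vocab_size if vocab_size is not None else len(counts)
    total = 0.0
    for w in sample:
        c = counts[w] - 1  # count of w with this occurrence held out
        p = (c + alpha) / ((n - 1) + alpha * v)
        total += -log2(p)
    return total / n

if __name__ == "__main__":
    text = "the cat sat on the mat and the dog sat on the log".split()
    print(f"Good-Turing missing mass: {good_turing_missing_mass(text):.3f}")
    print(f"Leave-one-out cross-entropy: "
          f"{leave_one_out_cross_entropy(text):.3f} bits/word")
```

The structural similarity between the two functions is the point: both score each observation against statistics computed with that observation removed, which is the shared form that the convergence arguments for the Good-Turing estimators might plausibly be extended to cover.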