Probabilistic models of sequences play a central role in most machine translation, automated speech recognition, lossless compression, spell-checking, and gene identification applications, to name but a few. Unfortunately, real-world sequence data often exhibit long-range dependencies, which can only be captured by complex, computationally challenging models. Sequence data arising from natural processes also often exhibit power-law properties, yet common sequence models do not capture such properties. The sequence memoizer is a new hierarchical Bayesian model for discrete sequence data that captures long-range dependencies and power-law characteristics while remaining computationally attractive. Its utility as a language model and as a general-purpose lossless compressor is demonstrated.