Approximate counting: a detailed analysis
BIT - Ellis Horwood series in artificial intelligence
Counting large numbers of events in small registers
Communications of the ACM
Space/time trade-offs in hash coding with allowable errors
Communications of the ACM
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Approximate frequency counts over data streams
VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases
Smoothing a tera-word language model
HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers
Streaming for large scale NLP: language modeling
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Stream-based randomised language models for SMT
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
Online generation of locality sensitive hash signatures
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Sketching techniques for large scale NLP
WAC-6 '10 Proceedings of the NAACL HLT 2010 Sixth Web as Corpus Workshop
Sketch techniques for scaling distributional similarity to the web
GEMS '10 Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics
Proceedings of the 15th Symposium on International Database Engineering & Applications
Approximate scalable bounded space sketch for large data NLP
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Space efficiencies in discourse modeling via conditional random sampling
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Streaming analysis of discourse participants
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Fast large-scale approximate graph construction for NLP
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Sketch algorithms for estimating point queries in NLP
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
Previous work by Talbot and Osborne [2007] explored the use of randomized storage mechanisms in language modeling. These structures trade a small amount of error for significant space savings, enabling the use of larger language models on relatively modest hardware. Going beyond space efficient count storage, here we present the Talbot Osborne Morris Bloom (TOMB) Counter, an extended model for performing space efficient counting over streams of finite length. Theoretical and experimental results are given, showing the promise of approximate counting over large vocabularies in the context of limited space.