Recent exploratory efforts in discourse-level language modeling have relied heavily on calculating Pointwise Mutual Information (PMI), which involves significant computation when done over large collections. Prior work has required aggressive pruning or independence assumptions to compute scores on large collections. We show the method of Conditional Random Sampling, thus far an underutilized technique, to be a space-efficient means of representing the sufficient statistics in discourse that underlie recent PMI-based work. This is demonstrated in the context of inducing Schankian script-like structures over news articles.
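For orientation, the PMI score referred to above can be sketched with exact counts. This is a minimal illustration, not the method of this work: it assumes co-occurrence means "appears in the same document", and the function name `pmi_table` and the toy documents are hypothetical. It stores one counter per co-occurring pair, which is the space cost that grows quickly on large collections.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return exact PMI for every pair of items sharing at least one document.

    Assumption: co-occurrence is defined at the document level.
    """
    n_docs = len(documents)
    item_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        items = sorted(set(doc))  # count each item once per document
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))  # keys are sorted pairs

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        # PMI(a, b) = log( P(a, b) / (P(a) * P(b)) )
        #           = log( n_ab * N / (n_a * n_b) )
        scores[(a, b)] = math.log(
            n_ab * n_docs / (item_counts[a] * item_counts[b])
        )
    return scores


# Toy data (hypothetical): each document is a list of event verbs.
docs = [
    ["arrest", "charge", "convict"],
    ["arrest", "charge"],
    ["elect", "resign"],
    ["arrest", "resign"],
]
scores = pmi_table(docs)
```

Pairs that co-occur more often than independence would predict, such as `("arrest", "charge")` here, receive a positive score; pairs that never co-occur have no entry at all.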