We present a framework for the fast computation of lexical affinity models. The framework comprises a novel algorithm that efficiently computes the co-occurrence distribution between pairs of terms, an independence model, and a parametric affinity model. Previous models either use arbitrary windows to compute similarity between words or use lexical affinity to create sequential models; in contrast, we focus on models intended to capture the co-occurrence patterns of any pair of words or phrases at any distance in the corpus. The framework is flexible, allowing fast adaptation to applications, and it is scalable. We apply it in combination with a terabyte corpus to answer natural language tests, achieving encouraging results.
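To make the idea of a windowless co-occurrence distribution concrete, here is a minimal sketch in Python. It is not the algorithm from the paper, whose details the abstract does not give. It assumes a tokenized corpus held in memory and, for each occurrence of one term, records the distance in tokens to the nearest occurrence of another term. The function names (`positions`, `nearest_distance_histogram`, `to_distribution`) and the nearest-occurrence definition of distance are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter


def positions(tokens, term):
    """Return the sorted token offsets at which `term` occurs."""
    return [i for i, tok in enumerate(tokens) if tok == term]


def nearest_distance_histogram(tokens, a, b):
    """Count, for each occurrence of `a`, the distance to the nearest `b`.

    Illustrative assumption: distance is the absolute difference in token
    offsets, with no fixed window. If `a == b`, every distance is 0.
    """
    pos_a = positions(tokens, a)
    pos_b = positions(tokens, b)
    hist = Counter()
    if not pos_b:
        return hist  # `b` never occurs, so no distance is defined
    for p in pos_a:
        k = bisect_left(pos_b, p)  # index of first `b` at or after p
        candidates = []
        if k < len(pos_b):
            candidates.append(pos_b[k] - p)      # nearest `b` to the right
        if k > 0:
            candidates.append(p - pos_b[k - 1])  # nearest `b` to the left
        hist[min(candidates)] += 1
    return hist


def to_distribution(hist):
    """Normalize a distance histogram into relative frequencies."""
    total = sum(hist.values())
    if total == 0:
        return {}
    return {d: c / total for d, c in sorted(hist.items())}


tokens = "the cat sat on the mat while the dog sat".split()
hist = nearest_distance_histogram(tokens, "the", "dog")
print(to_distribution(hist))
```

Because the positions of each term are sorted, a binary search finds the nearest neighbour of each occurrence without scanning a window. An empirical distribution of this kind could then be compared against an independence baseline or fitted with a parametric form, which is the role the abstract assigns to the independence and affinity models.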