This paper introduces new methods based on exponential families for modeling the correlations between words in text and speech. While previous work assumed the effects of word co-occurrence statistics to be constant over a window of several hundred words, we show that their influence is nonstationary on a much smaller time scale. Empirical data drawn from English and Japanese text, as well as conversational speech, reveals that the "attraction" between words decays exponentially, while stylistic and syntactic constraints create a "repulsion" between words that discourages close co-occurrence. We show that these characteristics are well described by simple mixture models based on two-stage exponential distributions which can be trained using the EM algorithm. The resulting distance distributions can then be incorporated as penalizing features in an exponential language model.
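As a rough illustration of the kind of training the abstract describes, the sketch below fits a simple two-component mixture of exponential densities to inter-word distance data with EM. This is not the paper's exact two-stage formulation; the function name, the initialization by quantiles, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def em_exp_mixture(d, k=2, iters=200):
    """Fit a k-component mixture of exponential densities
    p(d) = sum_k pi_k * lam_k * exp(-lam_k * d)
    to observed inter-word distances d via EM."""
    d = np.asarray(d, dtype=float)
    # Initialize rates from spread-out quantiles of the data (heuristic).
    lam = 1.0 / np.quantile(d, np.linspace(0.2, 0.8, k))
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each distance.
        dens = pi * lam * np.exp(-np.outer(d, lam))      # shape (n, k)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: closed-form updates for mixture weights and rates.
        nk = resp.sum(axis=0)
        pi = nk / len(d)
        lam = nk / (resp * d[:, None]).sum(axis=0)
    return pi, lam
```

On data drawn from two well-separated exponentials (e.g. mean distances 5 and 50 words), the recovered rates approximate the generating rates; the fast-decaying component plays the role of the short-range "attraction" effect, while adding a dead zone near zero distance would be one way to encode the "repulsion" the abstract mentions.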