Contextual word similarity and estimation from sparse data

Authors:
Ido Dagan; Shaul Marcus; Shaul Markovitch
Affiliations:
AT&T Bell Laboratories, Murray Hill, NJ; Technion, Haifa, Israel; Technion, Haifa, Israel
Venue:
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Year:
1993

Abstract

In recent years there has been much interest in word cooccurrence relations, such as n-grams, verb-object combinations, or cooccurrence within a limited context. This paper discusses how to estimate the probability of cooccurrences that do not occur in the training data. We present a method that makes local analogies between each specific unobserved cooccurrence and other cooccurrences that contain similar words, as determined by an appropriate word similarity metric. Our evaluation suggests that this method performs better than existing smoothing methods and may provide an alternative to class-based models.
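
The idea of estimating an unobserved cooccurrence by analogy with similar words can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not specify the similarity metric or the estimator, so cosine similarity over raw cooccurrence counts and a similarity-weighted average of conditional relative frequencies are assumptions made here, and the function names, the parameter `k`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(pairs):
    """Map each first word to a Counter of the second words seen with it."""
    vectors = defaultdict(Counter)
    for w1, w2 in pairs:
        vectors[w1][w2] += 1
    return dict(vectors)  # plain dict: lookups never insert new keys


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed metric)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_estimate(vectors, w1, w2, k=2):
    """Estimate P(w2 | w1) from the k words most similar to w1.

    Each neighbour contributes its own relative frequency of w2,
    weighted by its similarity to w1 (an assumed estimator).
    """
    target = vectors.get(w1, Counter())
    neighbours = sorted(
        ((cosine(target, vec), other)
         for other, vec in vectors.items() if other != w1),
        reverse=True,
    )[:k]
    total_sim = sum(sim for sim, _ in neighbours)
    if total_sim == 0:
        return 0.0
    weighted = sum(
        sim * vectors[other][w2] / sum(vectors[other].values())
        for sim, other in neighbours
    )
    return weighted / total_sim


# Toy verb-object data; the pair ("sip", "water") never occurs.
pairs = [
    ("drink", "water"), ("drink", "tea"), ("drink", "coffee"),
    ("sip", "tea"), ("sip", "coffee"),
    ("read", "book"), ("read", "paper"),
]
vectors = build_vectors(pairs)
estimate = similarity_estimate(vectors, "sip", "water")
print(round(estimate, 2))  # 0.33
```

In this toy data "sip" shares two objects with "drink" and none with "read", so the estimate for the unseen pair is driven entirely by how often "drink" takes "water". A plain relative-frequency estimate would assign the pair zero probability.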