Two methods of evaluation of semantic similarity of nouns based on their modifier sets

Authors:
Igor A. Bolshakov; Alexander Gelbukh
Affiliations:
Center for Computing Research, National Polytechnic Institute, Mexico City, Mexico (both authors)
Venue:
NLDB'07 Proceedings of the 12th international conference on Applications of Natural Language to Information Systems
Year:
2007

Abstract

Two methods are proposed for evaluating the semantic similarity or dissimilarity of English nouns, based on their modifier sets taken from the Oxford Collocations Dictionary for Students of English. The first method measures similarity as the proportion of modifiers applicable to both nouns under evaluation. The second method measures dissimilarity as the change in the mean cohesion between a noun and a set of modifiers when the noun's own modifiers are replaced by those of the contrasted noun. Cohesion between words is measured by the Stable Connection Index (SCI), which is based on raw Web statistics for occurrences and co-occurrences of words. The two proposed measures are shown to be approximately in inverse monotonic dependency, while the Web-based evaluations provide a higher resolution.
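
The two measures described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so several details here are assumptions rather than the authors' definitions: the overlap is normalized by the geometric mean of the two set sizes, the dissimilarity is made symmetric by averaging over both nouns, and `cohesion` is a hypothetical callable standing in for an SCI-like score that would in practice be derived from Web counts. The function names are illustrative only.

```python
from math import sqrt
from typing import Callable, Set

# Hypothetical signature: cohesion(modifier, noun) -> score (e.g. an SCI-like value).
Cohesion = Callable[[str, str], float]


def modifier_overlap(mods_a: Set[str], mods_b: Set[str]) -> float:
    """Similarity: share of modifiers applicable to both nouns.

    Normalizing by the geometric mean of the set sizes is an assumption;
    the abstract only speaks of "the portion of modifiers" in common.
    """
    if not mods_a or not mods_b:
        return 0.0
    return len(mods_a & mods_b) / sqrt(len(mods_a) * len(mods_b))


def mean_cohesion(noun: str, modifiers: Set[str], cohesion: Cohesion) -> float:
    """Mean cohesion between a noun and every modifier in a set."""
    if not modifiers:
        raise ValueError("modifier set must not be empty")
    return sum(cohesion(m, noun) for m in modifiers) / len(modifiers)


def cohesion_drop(noun_a: str, mods_a: Set[str],
                  noun_b: str, mods_b: Set[str],
                  cohesion: Cohesion) -> float:
    """Dissimilarity: fall in mean cohesion when each noun is paired with
    the other noun's modifiers instead of its own.

    Averaging over both nouns is an assumption made for symmetry.
    """
    own = (mean_cohesion(noun_a, mods_a, cohesion)
           + mean_cohesion(noun_b, mods_b, cohesion)) / 2
    cross = (mean_cohesion(noun_a, mods_b, cohesion)
             + mean_cohesion(noun_b, mods_a, cohesion)) / 2
    return own - cross
```

Under these assumptions, two nouns with identical modifier sets get an overlap of 1 and a cohesion drop of 0, while nouns sharing few modifiers get a low overlap and a large drop, which is the inverse relationship the abstract reports.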