Effect of word density on measuring words association

  • Authors:
  • Sanasam Ranbir Singh;Hema A. Murthy;Timothy A. Gonsalves

  • Affiliations:
  • Indian Institute of Technology Madras, Chennai, India;Indian Institute of Technology Madras, Chennai, India;Indian Institute of Technology Madras, Chennai, India

  • Venue:
  • COMPUTE '08 Proceedings of the 1st Bangalore Annual Compute Conference
  • Year:
  • 2008

Abstract

Mining associated words is not a new problem, but because of its wide range of applications it remains an important issue in Information Retrieval. Existing estimators, such as joint probability and the word association norm, do not consider the density of the words present in each window. In this paper, we incorporate word density and propose a density-based estimator to measure the association between words. Experimental results based on human judgments and on precision collected from search engines show that the precision of the estimators can be improved by incorporating word density. For all window sizes considered, our estimator outperforms all other estimators. We also observe that all of these estimators, both the existing ones and the proposed one, perform relatively better when the windows contain around five sentences. Finally, using the Spearman rank-order correlation coefficient, we show that our estimator returns a better-quality ranking of the associated terms.
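
As an illustration of the ranking comparison mentioned in the abstract, the following is a minimal sketch of the Spearman rank-order correlation coefficient, computed as the Pearson correlation of the ranks with tied values given their average rank. It is not the authors' code: the function names `average_ranks` and `spearman` and the sample scores are hypothetical, and the abstract does not say how ties were handled in the paper.

```python
from math import sqrt


def average_ranks(scores):
    """Rank scores so the largest gets rank 1; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Extend j over the run of items tied with the item at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank-order correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical example: association scores from an estimator versus
# human judgment scores for the same four candidate terms.
estimator_scores = [1, 2, 3, 4]
human_scores = [1, 3, 2, 4]
rho = spearman(estimator_scores, human_scores)
```

A value of `rho` near 1 would mean the estimator orders the candidate terms almost as the human judges do, and a value near -1 would mean the order is reversed.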