A weighted tag similarity measure based on a collaborative weight model

  • Authors: Gokavarapu Srinivas; Niket Tandon; Vasudeva Varma

  • Affiliations: International Institute of Information Technology, Hyderabad, India; Max Planck Institute, Saarbrücken, Germany; International Institute of Information Technology, Hyderabad, India

  • Venue: SMUC '10: Proceedings of the 2nd International Workshop on Search and Mining User-Generated Contents
  • Year: 2010

Abstract

Measuring semantic relatedness between social tags remains a largely open problem. Given the structure of social bookmarking systems, similarity measures need to be designed from the perspective of those systems. We address the fundamental problem of the weight model for tags, on which every similarity measure is based. We propose a weight model for tagging systems that takes the user dimension into account, unlike existing measures based on tag frequency. Visual analysis of tag clouds suggests that the proposed model yields intuitively better weights than tag frequency does. We also propose a weighted similarity model that is conceptually different from contemporary frequency-based similarity measures. Based on this model, we present weighted variations of several existing measures, such as Dice and cosine similarity. We evaluate the proposed similarity model using Spearman's correlation coefficient, with WordNet as the gold standard. Our method achieves a 20% improvement over traditional similarity measures such as Dice and cosine similarity, and also over the most recent tag similarity measures such as mutual information with distributional aggregation. Finally, we show the practical effectiveness of the proposed weighted similarity measures by performing search over tagged documents using Social SimRank on a large real-world dataset.
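
The abstract does not spell out the weight model or the weighted formulas, so the following is only an illustrative sketch of the general idea, not the authors' definitions. It assumes a tag's weight on a resource is the number of distinct users who applied that tag to it (one simple way to bring in the user dimension), and it uses common weighted generalizations of cosine and Dice similarity. The function names and the sample data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def user_count_weights(assignments):
    """Map each tag to {resource: number of distinct users who applied it}.

    Assumption: this user-count weight stands in for the paper's weight
    model, which the abstract does not define.
    """
    users = defaultdict(set)
    for user, tag, resource in assignments:
        users[(tag, resource)].add(user)
    weights = defaultdict(dict)
    for (tag, resource), who in users.items():
        weights[tag][resource] = len(who)
    return weights


def weighted_cosine(a, b):
    """Cosine similarity between two sparse {resource: weight} vectors."""
    dot = sum(w * b[r] for r, w in a.items() if r in b)
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def weighted_dice(a, b):
    """A common weighted Dice variant: 2 * sum(min) / (sum(a) + sum(b))."""
    shared = sum(min(w, b[r]) for r, w in a.items() if r in b)
    total = sum(a.values()) + sum(b.values())
    return 2 * shared / total if total else 0.0


# Hypothetical (user, tag, resource) triples; the repeated triple counts once.
assignments = [
    ("u1", "python", "r1"),
    ("u2", "python", "r1"),
    ("u1", "python", "r1"),
    ("u1", "code", "r1"),
    ("u3", "code", "r2"),
]
weights = user_count_weights(assignments)
cos_score = weighted_cosine(weights["python"], weights["code"])
dice_score = weighted_dice(weights["python"], weights["code"])
```

Counting distinct users rather than raw tag occurrences means that one user repeating the same assignment does not inflate a tag's weight, which is the kind of difference from plain tag frequency that the abstract alludes to.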