A weighted tag similarity measure based on a collaborative weight model

Authors:
Gokavarapu Srinivas;Niket Tandon;Vasudeva Varma
Affiliations:
International Institute of Information Technology, Hyderabad, India;Max Planck Institute, Saarbrücken, Germany;International Institute of Information Technology, Hyderabad, India
Venue:
SMUC '10 Proceedings of the 2nd international workshop on Search and mining user-generated contents
Year:
2010

Abstract

The problem of measuring semantic relatedness between social tags remains largely open. Given the structure of social bookmarking systems, similarity measures need to be designed from the perspective of those systems. We address the fundamental problem of the weight model for tags, on which every similarity measure is based. We propose a weight model for tagging systems that considers the user dimension, unlike existing measures based on tag frequency. Visual analysis of tag clouds suggests that the proposed model yields intuitively better weights than tag frequency. We also propose a weighted similarity model that is conceptually different from contemporary frequency-based similarity measures. Based on the weighted similarity model, we present weighted variations of several existing measures, such as the Dice and cosine similarity measures. We evaluate the proposed similarity model using Spearman's correlation coefficient, with WordNet as the gold standard. Our method achieves a 20% improvement over traditional similarity measures such as Dice and cosine similarity, and also over the most recent tag similarity measures, such as mutual information with distributional aggregation. Finally, we show the practical effectiveness of the proposed weighted similarity measures by performing search over tagged documents using Social SimRank on a large real-world dataset.
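The abstract does not give the weight model or the weighted measures themselves, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a toy set of (user, tag, resource) triples, takes "considering the user dimension" to mean counting distinct users per tag-resource pair rather than raw tag occurrences, and uses one common weighted generalization of Dice (sum of minima over sum of weights) alongside standard cosine similarity. The data and the names `tag_vectors`, `cosine`, and `weighted_dice` are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy annotations: (user, tag, resource) triples.
annotations = [
    ("u1", "python", "r1"),
    ("u2", "python", "r1"),
    ("u1", "programming", "r1"),
    ("u3", "programming", "r2"),
    ("u3", "python", "r2"),
    ("u1", "cooking", "r3"),
    ("u1", "cooking", "r3"),  # same user repeats the same tag
]


def tag_vectors(triples, by_users=True):
    """Return {tag: {resource: weight}}.

    by_users=True  -> weight = number of distinct users (assumed user-aware weight)
    by_users=False -> weight = raw occurrence count (plain tag frequency)
    """
    users = defaultdict(set)
    counts = defaultdict(int)
    for user, tag, resource in triples:
        users[(tag, resource)].add(user)
        counts[(tag, resource)] += 1
    vectors = defaultdict(dict)
    for (tag, resource), n in counts.items():
        weight = len(users[(tag, resource)]) if by_users else n
        vectors[tag][resource] = float(weight)
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b[r] for r, w in a.items() if r in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def weighted_dice(a, b):
    """Weighted Dice: 2 * sum of shared minimum weights / sum of all weights."""
    shared = sum(min(w, b[r]) for r, w in a.items() if r in b)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


vectors = tag_vectors(annotations)
print(cosine(vectors["python"], vectors["programming"]))
print(weighted_dice(vectors["python"], vectors["programming"]))
```

In this sketch the repeated "cooking" annotation by a single user counts once under the user-aware weight but twice under raw frequency, which is the kind of difference a user-dimension weight model is meant to capture.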