The problem of measuring semantic relatedness between social tags remains largely open. Given the structure of social bookmarking systems, similarity measures need to be designed from the perspective of those systems. We address the fundamental problem of the weight model for tags, on which every similarity measure is based. We propose a weight model for tagging systems that takes the user dimension into account, unlike existing measures based on tag frequency. Visual analysis of tag clouds suggests that the proposed model yields intuitively better weights than tag frequency. We also propose a weighted similarity model that is conceptually different from contemporary frequency-based similarity measures. Based on the weighted similarity model, we present weighted variations of several existing measures, such as the Dice and cosine similarity measures. We evaluate the proposed similarity model using Spearman's correlation coefficient, with WordNet as the gold standard. Our method achieves a 20% improvement over traditional similarity measures such as Dice and cosine similarity, and over the most recent tag similarity measures such as mutual information with distributional aggregation. Finally, we demonstrate the practical effectiveness of the proposed weighted similarity measures by performing search over tagged documents using Social SimRank on a large real-world dataset.
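To make the idea of weighted Dice and cosine variants concrete, the following is a minimal sketch only. The abstract does not state the proposed weight model, so the code assumes a stand-in: the weight of a tag on a resource is the number of distinct users who assigned it. The toy data, the function names (`tag_vectors`, `weighted_cosine`, `weighted_dice`) and the min-based form of the weighted Dice coefficient are all illustrative assumptions, not the paper's definitions.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy folksonomy: (user, tag, resource) assignments.
ASSIGNMENTS = [
    ("u1", "python", "r1"),
    ("u2", "python", "r1"),
    ("u1", "code", "r1"),
    ("u3", "python", "r2"),
    ("u3", "code", "r2"),
    ("u2", "snake", "r3"),
]


def tag_vectors(assignments):
    """Map each tag to {resource: weight}.

    ASSUMPTION: weight = number of distinct users who applied the tag
    to the resource. This stands in for the paper's weight model,
    which the abstract does not specify.
    """
    users = defaultdict(set)
    for user, tag, resource in assignments:
        users[(tag, resource)].add(user)
    vectors = defaultdict(dict)
    for (tag, resource), who in users.items():
        vectors[tag][resource] = len(who)
    return dict(vectors)


def weighted_cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b[r] for r, w in a.items() if r in b)
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def weighted_dice(a, b):
    """One common weighted generalization of Dice (assumed form):
    twice the shared weight (element-wise minimum) over the total weight."""
    shared = sum(min(w, b[r]) for r, w in a.items() if r in b)
    total = sum(a.values()) + sum(b.values())
    return 2 * shared / total if total else 0.0


vectors = tag_vectors(ASSIGNMENTS)
print(weighted_cosine(vectors["python"], vectors["code"]))
print(weighted_dice(vectors["python"], vectors["code"]))
```

With binary weights (every weight set to 1), both functions reduce to the usual set-based cosine and Dice coefficients, which is why they can be read as weighted variations of those measures.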