Measuring social tag confidence: is it a good or bad tag?

  • Authors:
  • Xiwu Gu;Xianbing Wang;Ruixuan Li;Kunmei Wen;Yufei Yang;Weijun Xiao

  • Affiliations:
  • Intelligent and Distributed Computing Lab, College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, P.R. China;Intelligent and Distributed Computing Lab, College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, P.R. China;Intelligent and Distributed Computing Lab, College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, P.R. China;Intelligent and Distributed Computing Lab, College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, P.R. China;Intelligent and Distributed Computing Lab, College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, P.R. China;Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN

  • Venue:
  • WAIM'11 Proceedings of the 12th international conference on Web-age information management
  • Year:
  • 2011

Abstract

Social tagging is an increasingly popular way to describe the latent semantic information of web resources and is therefore widely used to improve the performance of information retrieval systems. However, the quality of social tags varies significantly, because anyone on the web can annotate resources freely. As a consequence, how to measure the quality of social tags (referred to as social tag confidence) becomes an important issue. In this paper, we propose a statistical model that measures the confidence of social tags by combining three attributes of a social tag: the web resource, the tag, and the tagging user. To evaluate the effectiveness of our model, we perform two experiments on datasets crawled from del.icio.us. The experimental results show that our model outperforms other approaches with respect to Normalized Discounted Cumulated Gain (NDCG). In addition, the F1 measure of tagged web page clustering also increases when our model is used to filter out noisy social tags with low tag confidence.
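
For readers unfamiliar with the ranking metric named in the abstract, the following is a minimal sketch of how NDCG is commonly computed for a ranked list of tags. It is not taken from the paper: the function names `dcg` and `ndcg`, the graded relevance values, and the choice of the linear-gain, log2-discount variant are all assumptions made for illustration, and the abstract does not state which NDCG variant the authors used.

```python
import math


def dcg(relevances):
    """Discounted cumulated gain: each gain is divided by log2(rank + 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking.

    `relevances` lists hypothetical graded relevance judgments in the order
    the tags were ranked (for example, by a tag confidence score).
    """
    relevances = list(relevances)
    ranked = relevances if k is None else relevances[:k]
    # The ideal ranking places the most relevant items first.
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    # Assumed convention: a list with no relevant items scores 0.
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments for two rankings of the same three tags.
perfect_order = ndcg([3, 2, 1])
reversed_order = ndcg([1, 2, 3])
```

A ranking that places the most relevant tags first scores 1.0, and any other ordering of the same tags scores lower, which is why the metric suits comparing methods that order tags by estimated confidence.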