An analysis of the use of tags in a blog recommender system

  • Authors:
  • Conor Hayes; Paolo Avesani; Sriharsha Veeramachaneni

  • Affiliations:
  • ITC-IRST, Povo, Trento, Italy; ITC-IRST, Povo, Trento, Italy; ITC-IRST, Povo, Trento, Italy

  • Venue:
  • IJCAI'07 Proceedings of the 20th International Joint Conference on Artificial Intelligence
  • Year:
  • 2007

Abstract

The Web is experiencing an exponential growth in the use of weblogs or blogs, websites containing dated journal-style entries. Blog entries are generally organised using informally defined labels known as tags. Increasingly, tags are being proposed as a 'grassroots' alternative to Semantic Web standards. We demonstrate that tags by themselves are weak at partitioning blog data. We then show how tags may contribute useful, discriminating information. Using content-based clustering, we observe that frequently occurring tags in each cluster are usually good meta-labels for the cluster concept. We then introduce the Tr score, a score based on the proportion of high-frequency tags in a cluster, and demonstrate that it is strongly correlated with cluster strength. We demonstrate how the Tr score enables the detection and removal of weak clusters. As such, the Tr score can be used as an independent means of verifying topic integrity in a cluster-based recommender system.
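
The abstract describes the Tr score only as "a score based on the proportion of high-frequency tags in a cluster" and does not give the formula. The Python sketch below is therefore one possible reading, not the authors' definition: it assumes the score is the fraction of tag occurrences in a cluster that belong to tags appearing at least `min_count` times. The names `tag_ratio_score`, `filter_weak_clusters`, `min_count`, and `threshold` are all hypothetical, as are the default values.

```python
from collections import Counter


def tag_ratio_score(cluster_tags, min_count=2):
    """Illustrative score: share of tag occurrences that come from
    tags used at least `min_count` times in the cluster.

    This is an assumed reading of "proportion of high-frequency tags",
    not the formula from the paper.
    """
    counts = Counter(cluster_tags)
    total = sum(counts.values())
    if total == 0:
        # An untagged cluster gives no evidence of a shared topic.
        return 0.0
    frequent = sum(c for c in counts.values() if c >= min_count)
    return frequent / total


def filter_weak_clusters(clusters, threshold=0.5, min_count=2):
    """Keep only clusters whose score reaches `threshold` (assumed cut-off)."""
    return {
        cluster_id: tags
        for cluster_id, tags in clusters.items()
        if tag_ratio_score(tags, min_count) >= threshold
    }


# Hypothetical clusters: each maps a cluster id to the tags of its blog posts.
clusters = {
    "c1": ["python", "python", "python", "code", "code", "misc"],
    "c2": ["travel", "food", "music", "news"],
}
kept = filter_weak_clusters(clusters)
```

Under these assumptions, a cluster in which a few tags recur across many posts scores close to 1, while a cluster whose tags are all distinct scores 0 and would be dropped as weak.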