TagScore: Approximate Similarity Using Tag Synopses

Authors:
Alex Penev; Raymond K. Wong
Affiliations:
-;-
Venue:
WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01
Year:
2008

Citing 10

Fast Approximate Similarity Search in Extremely High-Dimensional Data Sets
ICDE '05 Proceedings of the 21st International Conference on Data Engineering

Improved annotation of the blogosphere via autotagging and hierarchical clustering
Proceedings of the 15th international conference on World Wide Web

HT06, tagging paper, taxonomy, Flickr, academic article, to read
Proceedings of the seventeenth conference on Hypertext and hypermedia

P-TAG: large scale automatic generation of personalized annotation tags for the web
Proceedings of the 16th international conference on World Wide Web

Combating spam in tagging systems
AIRWeb '07 Proceedings of the 3rd international workshop on Adversarial information retrieval on the web

Authors vs. readers: a comparative study of document metadata and content in the www
Proceedings of the 2007 ACM symposium on Document engineering

Can social bookmarking improve web search?
WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining

Finding similar pages in a social tagging repository
Proceedings of the 17th international conference on World Wide Web

Social tag prediction
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Ontologies are us: a unified model of social networks and semantics
ISWC'05 Proceedings of the 4th international conference on The Semantic Web

Cited 1

Framework for timely and accurate ads on mobile devices
Proceedings of the 18th ACM conference on Information and knowledge management

Abstract

Collaborative tagging is the aggregate effort by a community of online users to annotate web content with metadata labels called tags. It is a simple activity that enriches our knowledge about digital content, and it has gained popularity through services such as Del.icio.us. Del.icio.us has a large repository that evolves daily, presenting interesting new problems for IR. We present TagScore, a scoring function that rates how well Del.icio.us tags describe their associated web page. The scores give us a succinct synopsis of a page, which we can use to find similar pages efficiently. Using real Del.icio.us data, we show that our approach correlates well with cosine similarity while being several hundred times faster and requiring minimal storage overhead.
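
The abstract does not state the TagScore formula, so the following is only an illustrative sketch of the general idea it describes: comparing pages through a small tag synopsis rather than through full tag vectors with cosine similarity. Everything here is an assumption made for illustration, including the function names, the relative-frequency score used as a stand-in for TagScore, the top-k cutoff, and the toy tag counts.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Baseline: cosine similarity between two sparse tag-count vectors."""
    dot = sum(count * b[tag] for tag, count in a.items() if tag in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def synopsis(tag_counts, k=3):
    """Hypothetical synopsis: the k most frequent tags with relative-frequency scores.

    This scoring is a stand-in chosen for illustration; it is NOT the
    TagScore function from the paper.
    """
    total = sum(tag_counts.values())
    if total == 0:
        return {}
    # Sort by descending count, then by tag name so ties break deterministically.
    top = sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return {tag: count / total for tag, count in top}


def synopsis_similarity(s1, s2):
    """Approximate similarity: sum of the smaller score over tags shared by both synopses."""
    return sum(min(score, s2[tag]) for tag, score in s1.items() if tag in s2)


# Toy data (invented): tag -> number of users who applied that tag to the page.
page_a = Counter({"python": 40, "programming": 30, "tutorial": 20, "web": 10})
page_b = Counter({"python": 25, "programming": 15, "reference": 10})
page_c = Counter({"recipes": 50, "cooking": 30, "food": 20})

print(cosine_similarity(page_a, page_b))
print(synopsis_similarity(synopsis(page_a), synopsis(page_b)))
print(synopsis_similarity(synopsis(page_a), synopsis(page_c)))
```

In this sketch each page is reduced to at most k scored tags, so a comparison touches only a handful of entries regardless of how many distinct tags the page has collected. That is the kind of speed and storage trade-off the abstract refers to, although the paper's actual scoring and matching may differ.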