Getting the most out of social annotations for web page classification

  • Authors:
  • Arkaitz Zubiaga; Raquel Martínez; Víctor Fresno

  • Affiliations:
  • Universidad Nacional de Educación a Distancia, Madrid, Spain; Universidad Nacional de Educación a Distancia, Madrid, Spain; Universidad Nacional de Educación a Distancia, Madrid, Spain

  • Venue:
  • Proceedings of the 9th ACM Symposium on Document Engineering
  • Year:
  • 2009

Abstract

User-generated annotations on social bookmarking sites can provide promising metadata for web document management tasks such as web page classification. These annotations include diverse types of information, such as tags and comments, and each type differs in nature and in popularity. In this work, we analyze and evaluate the usefulness of each type of social annotation for classifying web pages over a taxonomy like the one proposed by the Open Directory Project. We compare each type separately against content-based classification, and we also combine the different types of data to improve performance. Our experiments show encouraging results for the use of social annotations in this task, and we find that combining these metadata with web page content further improves the classifier's performance.
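
The idea of combining social annotations with page content can be illustrated with a small sketch. The abstract does not state which classifier or combination scheme the authors used, so everything below is an assumption made for illustration: features from content and from tags are merged into one sparse vector (with a source prefix so the two vocabularies stay distinct), and a simple nearest-centroid rule with cosine similarity stands in for the classifier. The function names, the toy pages, and the two category labels are hypothetical.

```python
from collections import Counter
import math


def bag_of_words(text, prefix):
    """Lowercase, split on whitespace, and prefix each token with its source."""
    return Counter(f"{prefix}:{tok}" for tok in text.lower().split())


def combine(content, tags):
    """Merge content and tag features into one sparse vector (assumed scheme)."""
    return bag_of_words(content, "content") + bag_of_words(" ".join(tags), "tag")


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples):
    """Sum the combined vectors of each category into one centroid per category."""
    centroids = {}
    for content, tags, label in examples:
        centroids.setdefault(label, Counter()).update(combine(content, tags))
    return centroids


def classify(centroids, content, tags):
    """Return the category whose centroid is most similar to the page."""
    vec = combine(content, tags)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy training data (invented for this sketch, not from the paper).
examples = [
    ("python tutorial for beginners", ["programming", "code"], "Computers"),
    ("league results and match reports", ["football", "scores"], "Sports"),
]
centroids = train_centroids(examples)

print(classify(centroids, "latest match highlights", ["football"]))
print(classify(centroids, "welcome to my home page", ["code"]))
```

In the second call the page text shares no words with either category, so the tag alone decides the outcome; this is the kind of case where annotations can add signal that content lacks.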