Tags in domain-specific sites: new information?

  • Authors:
  • Jeremy Steinhauer;Lois M.L. Delcambre;David Maier;Marianne Lykke;Vu H. Tran

  • Affiliations:
  • Portland State University, Portland, OR, USA;Portland State University, Portland, OR, USA;Portland State University, Portland, OR, USA;Aalborg University, Aalborg, Denmark;Portland State University, Portland, OR, USA

  • Venue:
  • Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

If researchers use tags in retrieval applications they might assume, implicitly, that tags represent novel information, e.g., when they attribute performance improvement in their retrieval algorithm(s) to the use of tags. In this work, we investigate whether this assumption is true. We focus on the use of tags in domain-specific websites because such websites are more likely to have a coherent, discernible website structure and because the users that are searching for and tagging pages in such a site may have specific information needs (as opposed to the broad range of information needs that users have when browsing/searching the Internet at large). For this study, we assume that the application of the same tag to multiple pages provides an indication that those pages are related. To determine whether this indication of relatedness is contributing new information, we first measure whether pages with common tag(s) could have been deemed as related based on site structure as measured by shortest navigational distance between pages. Second, we measure whether or not tags could have been determined algorithmically based on standard tf-idf scores of terms on the page. Based on our analysis of two different sites, we found that tags contribute novel information that is not discernible from site structure or site/page content.