A comparison of MeSH terms and CiteULike social tags as metadata for the same items

  • Authors:
  • Danielle H. Lee; Titus Schleyer

  • Affiliations:
  • University of Pittsburgh, Pittsburgh, PA, USA; University of Pittsburgh, Pittsburgh, PA, USA

  • Venue:
  • Proceedings of the 1st ACM International Health Informatics Symposium
  • Year:
  • 2010

Abstract

In this paper, we examine how much two types of metadata for biomedical articles differ when they are generated by different groups of people. The first type is social tags, which readers assign to articles using an uncontrolled vocabulary. The second type is index terms, which professionally trained indexers and domain experts assign using a controlled vocabulary. When both kinds of metadata are assigned to the same item, we might expect them to overlap to a large extent and to be able to substitute for one another. In this study, we compared social tags and index terms for a set of papers that appear in both CiteULike and MEDLINE, and assessed their differences. Because social tags are idiosyncratic, we preprocessed them through normalization, stop-word removal, stemming, and spell-checking. Our results show that social tags and Medical Subject Headings (MeSH) index terms have little overlap and embody largely heterogeneous understandings of the items.
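
The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the authors' pipeline: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer), the use of Jaccard overlap as the measure, and the sample tags and index terms are all assumptions made for illustration, and spell-checking is left out.

```python
import re

# Tiny illustrative stop-word list; an assumption, not the list used in the paper.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def crude_stem(word):
    """Strip a few common English suffixes (a rough stand-in for a real stemmer)."""
    for suffix in ("ies", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)] + ("y" if suffix == "ies" else "")
    return word


def normalize(terms):
    """Lowercase, split on non-alphanumeric characters, drop stop words, stem."""
    tokens = set()
    for term in terms:
        for word in re.split(r"[^a-z0-9]+", term.lower()):
            if word and word not in STOP_WORDS:
                tokens.add(crude_stem(word))
    return tokens


def jaccard(a, b):
    """Share of distinct tokens that appear in both sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical metadata for one article: reader tags versus indexer terms.
social_tags = ["gene-expression", "Microarrays", "to_read"]
index_terms = [
    "Gene Expression Profiling",
    "Oligonucleotide Array Sequence Analysis",
    "Humans",
]

overlap = jaccard(normalize(social_tags), normalize(index_terms))
print(f"Token overlap: {overlap:.2f}")
```

Even after normalization, the two hypothetical sets share only the tokens for "gene" and "expression"; personal tags such as "to_read" and broad index terms such as "Humans" have no counterpart on the other side.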