Folksonomy-Based Term Extraction for Word Cloud Generation

  • Authors:
  • David Carmel; Erel Uziel; Ido Guy; Yosi Mass; Haggai Roitman

  • Affiliations:
  • IBM Research - Haifa Lab; IBM Research - Haifa Lab; IBM Research - Haifa Lab; IBM Research - Haifa Lab; IBM Research - Haifa Lab

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST)
  • Year:
  • 2012

Abstract

In this work we study the task of term extraction for word cloud generation in sparsely tagged domains, in which manual tags are scarce. We present a folksonomy-based term extraction method, called tag-boost, which boosts terms that are frequently used by the public to tag content. Our experiments with tag-boost-based term extraction over different domains demonstrate a tremendous improvement in word cloud quality, as reflected by the agreement between the manual tags of the test items and the cloud terms extracted from the items’ content. Moreover, our results demonstrate the high robustness of this approach compared to alternative cloud generation methods, which exhibit high sensitivity to data sparseness. Additionally, we show that tag-boost can be applied effectively even in nontagged domains by using an external, rich folksonomy borrowed from a well-tagged domain.
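
The abstract does not give the tag-boost scoring formula, so the following is only a minimal illustrative sketch of the general idea: rank the terms found in an item's content, and boost those that the public frequently uses as tags. The function name `extract_cloud_terms`, the `boost_weight` parameter, the logarithmic boost, and the toy folksonomy counts are all assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter


def extract_cloud_terms(text, tag_counts, top_k=5, boost_weight=1.0):
    """Rank the terms of `text`, boosting those that are common tags.

    Assumed scoring (hypothetical, not the paper's formula):
        score(t) = tf(t) * (1 + boost_weight * log(1 + tag_count(t)))
    where tag_count(t) is how often t was used as a tag in the folksonomy.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tf = Counter(tokens)
    scores = {
        term: freq * (1.0 + boost_weight * math.log1p(tag_counts.get(term, 0)))
        for term, freq in tf.items()
    }
    # Sort by descending score, then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


# Toy folksonomy: how often each term was used as a tag (made-up counts).
folksonomy = Counter({"python": 120, "visualization": 45, "cloud": 30})
doc = "The the the tutorial shows python code for cloud visualization of the data"

terms = extract_cloud_terms(doc, folksonomy, top_k=3)
print(terms)
```

In this toy example the frequent but untagged word "the" would rank first on raw term frequency alone (`boost_weight=0.0`), while the boost moves the tagged terms ahead of it. The folksonomy passed in could come from the same domain or, as the abstract suggests for nontagged domains, from an external well-tagged one.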