Subjectivity annotation of the microblog 2011 realtime adhoc relevance judgments

  • Authors:
  • Georgios Paltoglou; Kevan Buckley

  • Affiliations:
  • School of Technology, University of Wolverhampton, Wolverhampton, UK; School of Technology, University of Wolverhampton, Wolverhampton, UK

  • Venue:
  • ECIR'13: Proceedings of the 35th European Conference on Advances in Information Retrieval
  • Year:
  • 2013

Abstract

In this work, we extend the Microblog dataset with subjectivity annotations. Our aim is twofold. First, we want to provide a high-quality, multiply-annotated gold standard of subjectivity annotations for the relevance assessments of the real-time adhoc task. Second, we randomly sample the rest of the dataset and annotate it for subjectivity once, in order to create a complementary annotated dataset that is at least an order of magnitude larger than the gold standard. As a result, we have 2,389 tweets that have been annotated by multiple humans and 75,761 tweets that have been annotated by one annotator. We discuss issues such as inter-annotator agreement, the time that annotators took to classify tweets in relation to the tweets' subjective content and, lastly, the distribution of subjective tweets in relation to topic categorization. The annotated datasets and all relevant anonymised information are freely available for research purposes.
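
The abstract mentions inter-annotator agreement without naming the measure used. As a rough illustration only, the sketch below computes Cohen's kappa for two annotators. The choice of Cohen's kappa, the function name `cohen_kappa`, the label strings and the toy data are all assumptions made for this example and are not taken from the paper or its dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Hypothetical helper for illustration; the paper's own agreement
    measure may differ.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two
    # annotators' marginal proportions, summed over labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (invented): four tweets labelled by two annotators.
annotator_1 = ["subjective", "subjective", "objective", "objective"]
annotator_2 = ["subjective", "objective", "objective", "objective"]
print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

With more than two annotators per tweet, as in a multiply-annotated gold standard, a multi-rater measure such as Fleiss' kappa or Krippendorff's alpha would be the more natural choice. The pairwise version is shown here only because it is the simplest to follow.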