Improving categorisation in social media using hyperlinks to structured data sources

  • Authors:
  • Sheila Kinsella;Mengjiao Wang;John G. Breslin;Conor Hayes

  • Affiliations:
  • Digital Enterprise Research Institute, National University of Ireland, Galway;Digital Enterprise Research Institute, National University of Ireland, Galway;Digital Enterprise Research Institute, National University of Ireland, Galway and School of Engineering and Informatics, National University of Ireland, Galway;Digital Enterprise Research Institute, National University of Ireland, Galway

  • Venue:
  • ESWC'11 Proceedings of the 8th extended semantic web conference on The semantic web: research and applications - Volume Part II
  • Year:
  • 2011

Abstract

Social media presents unique challenges for topic classification, including the brevity of posts, the informal nature of conversations, and the frequent reliance on external hyperlinks to give context to a conversation. In this paper we investigate the usefulness of these external hyperlinks for categorising the topic of individual posts. We focus our analysis on objects that have related metadata available on the Web, either via APIs or as Linked Data. Our experiments show that the inclusion of metadata from hyperlinked objects in addition to the original post content significantly improved classifier performance on two disparate datasets. We found that including selected metadata from APIs and Linked Data gave better results than including text from HTML pages. We investigate how this improvement varies across different topics. We also make use of the structure of the data to compare the usefulness of different types of external metadata for topic classification in a social media dataset.
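
The core idea of the abstract, augmenting a short post with selected metadata from the objects it links to before classifying it, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the URLs, the metadata fields (`title`, `category`), the toy posts, and the choice of a small multinomial Naive Bayes classifier are hypothetical and are not taken from the paper, which the abstract does not describe at this level of detail.

```python
import math
from collections import Counter, defaultdict

# Hypothetical metadata, standing in for what an API or Linked Data
# lookup might return for each hyperlinked object.
METADATA_BY_URL = {
    "http://example.org/v/1": {"title": "Live concert", "category": "music"},
    "http://example.org/p/2": {"title": "Cup final highlights", "category": "football"},
    "http://example.org/p/4": {"title": "Football cup replay", "category": "football"},
}
SELECTED_FIELDS = ("title", "category")  # assumed choice of metadata fields


def tokens(text):
    return text.lower().split()


def augment(post):
    """Return the post text followed by selected metadata of linked objects."""
    parts = [post["text"]]
    for url in post.get("links", []):
        meta = METADATA_BY_URL.get(url, {})
        parts.extend(meta[f] for f in SELECTED_FIELDS if f in meta)
    return " ".join(parts)


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokens(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for w in tokens(doc):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Two training posts with identical text: only the links tell them apart.
train = [
    ({"text": "you have to see this", "links": ["http://example.org/v/1"]}, "Music"),
    ({"text": "you have to see this", "links": ["http://example.org/p/2"]}, "Sport"),
]
model = NaiveBayes().fit([augment(p) for p, _ in train], [y for _, y in train])

new_post = {"text": "you have to see this", "links": ["http://example.org/p/4"]}
predicted = model.predict(augment(new_post))
print(predicted)
```

In this toy setting the post text alone carries no topical signal, so any separation between the two classes comes from the metadata of the linked objects. That mirrors the motivation stated in the abstract, but the sketch makes no claim about the size of the effect on real data.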