Semantic Characterization of Tweets Using Topic Models: A Use Case in the Entertainment Domain

  • Authors:
  • Andrés García-Silva; Víctor Rodríguez-Doncel; Oscar Corch

  • Affiliations:
  • Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain; Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain; Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain

  • Venue:
  • International Journal on Semantic Web & Information Systems
  • Year:
  • 2013

Abstract

In the entertainment domain, users tweet about their expectations and opinions regarding upcoming, current and past experiences, while companies advertise and promote the shows. Characterizing tweets along these lines matters to both customers and companies, and it goes beyond traditional sentiment analysis, where the polarity of an opinion is usually labeled positive, negative or neutral. The authors investigate different tweet representation models, including bags of words and probabilistic topic models, to shed light on the semantics of the messages. Their experiments show that topic-based models generated with Latent Dirichlet Allocation (LDA) yield better categorizations than TF-IDF-based features most of the time, particularly when the models are enriched with natural language features and Twitter-specific slang.
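
To make the two kinds of representation named in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It builds a TF-IDF vector and an LDA topic-proportion vector for each tweet in a tiny corpus, using only the Python standard library. The example tweets, the function names, the collapsed Gibbs sampler used to fit LDA, and all hyperparameter values (`n_topics`, `alpha`, `beta`, `n_iter`) are assumptions chosen for illustration; the abstract does not specify any of them, and the enrichment with natural language features and Twitter slang is not modeled here.

```python
import math
import random
from collections import Counter

# Hypothetical toy tweets for illustration; not data from the paper.
TWEETS = [
    "cant wait for the concert tonight",
    "tickets on sale now for the show",
    "the movie last night was amazing",
    "buy tickets now for the concert",
    "that show was amazing lol",
    "cant wait for the movie",
]

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Return (vocab, vectors): one TF-IDF vector per document."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    # Document frequency: number of documents containing each word.
    df = Counter(w for doc in tokens for w in set(doc))
    idf = {w: math.log(len(docs) / df[w]) + 1.0 for w in vocab}
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        vectors.append([tf[w] / len(doc) * idf[w] for w in vocab])
    return vocab, vectors

def lda_topic_vectors(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions."""
    rng = random.Random(seed)
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # topic counts per document
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # total words per topic
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(tokens):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    # Smoothed topic proportions; each row sums to 1.
    return [
        [(doc_topic[d][k] + alpha) / (len(tokens[d]) + n_topics * alpha)
         for k in range(n_topics)]
        for d in range(len(docs))
    ]

vocab, tfidf = tfidf_vectors(TWEETS)
topics = lda_topic_vectors(TWEETS)
print(len(tfidf[0]), "TF-IDF dimensions vs", len(topics[0]), "topic dimensions")
```

Either set of vectors could then be passed to a classifier as features. The contrast the sketch is meant to show is dimensionality: the TF-IDF vector has one entry per vocabulary word, while the topic vector has one entry per topic, which gives a much more compact description of each short message.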