Sentiment classification of blog posts using topical extracts

  • Authors:
  • Zhixin Zhou; Xiuzhen Zhang; Phil Vines

  • Affiliations:
  • RMIT University, Melbourne VIC Australia; RMIT University, Melbourne VIC Australia; RMIT University, Melbourne VIC Australia

  • Venue:
  • ADC '12 Proceedings of the Twenty-Third Australasian Database Conference - Volume 124
  • Year:
  • 2012

Abstract

Unlike news stories and product reviews, which usually have a strong focus on a single topic, blog posts are often unstructured, and the opinions expressed in them do not necessarily correspond to a specific topic. This can lead to unsatisfactory sentiment classification performance. In this paper we report our pilot study on addressing topic drift in blogs. We examine this phenomenon by manual inspection and establish a ground truth. Our annotations show that topic drift is indeed very common, with all documents sampled showing a considerable degree of drift, averaging over 80%. The topical sentences are extracted from each post to produce an extract data set. We propose to address the topic drift problem by classifying the blog posts using the sentence-level polarities of topical extracts. We propose and evaluate two models for aggregating the sentence polarities by comparing their performance to that of a popular word-based model. Our preliminary results suggest that topical extracts can provide a concise but more accurate representation of the sentiment polarity of the blog posts. More importantly, sentence-level polarities are potentially more reliable evidence than word distributions for predicting document polarity.
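
The idea of classifying a post from the sentence-level polarities of its topical extract can be illustrated with a minimal sketch. The abstract does not specify the two aggregation models, so the majority-vote and score-sum aggregators below are illustrative assumptions, not the authors' method. The topicality flags and polarity scores are likewise hypothetical inputs that would come from annotation or upstream classifiers.

```python
from typing import List, Tuple

# One sentence = (is_topical, polarity_score).
# Both fields are hypothetical inputs for this sketch: the flag would come from
# manual annotation or a topic classifier, the score from a sentence-level
# sentiment classifier (positive > 0, negative < 0).
Sentence = Tuple[bool, float]


def topical_extract(sentences: List[Sentence]) -> List[float]:
    """Keep only the polarity scores of sentences flagged as on-topic."""
    return [score for is_topical, score in sentences if is_topical]


def classify_by_vote(scores: List[float]) -> str:
    """Assumed aggregator 1: majority vote over sentence polarity signs."""
    positives = sum(1 for s in scores if s > 0)
    negatives = sum(1 for s in scores if s < 0)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


def classify_by_sum(scores: List[float]) -> str:
    """Assumed aggregator 2: sign of the summed sentence polarity scores."""
    total = sum(scores)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# A made-up post in which the off-topic sentences are mostly negative,
# while the on-topic sentences are positive.
post: List[Sentence] = [
    (True, 0.6),
    (False, -0.9),
    (False, -0.8),
    (True, 0.2),
    (False, -0.4),
]

all_scores = [score for _, score in post]
extract_scores = topical_extract(post)

print("whole post, vote:", classify_by_vote(all_scores))
print("extract, vote:   ", classify_by_vote(extract_scores))
print("extract, sum:    ", classify_by_sum(extract_scores))
```

In this toy post the off-topic sentences outnumber the topical ones, so aggregating over the whole post and aggregating over the extract give opposite labels, which is the kind of effect topic drift could have on document-level classification.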