A fact/opinion classifier for news articles

  • Authors:
  • Adam Stepinski;Vibhu Mittal

  • Affiliations:
  • Rice University, Houston, TX;Google, Mountain View, CA

  • Venue:
  • SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Many online news/blog aggregators like Google, Yahoo and MSN allow users to browse/search many hundreds of news sources. This results in dozens, often hundreds, of stories about the same event. While the news aggregators cluster these stories, allowing the user to efficiently scan the major news items at any given time, they do not currently allow alternative browsing mechanisms within the clusters. Furthermore, their intra-cluster ranking mechanisms are often based on a notion of authority/popularity of the source. In many cases, this leads to the classic power law phenomenon -- the popular stories/sources are the ones that are already popular/authoritative, thus reinforcing one dominant viewpoint. Ideally, these aggregators would exploit the availability of the tremendous number of sources to identify the various dominant threads or viewpoints about a story and highlight these threads for the users. This paper presents an initial limited approach to such an interface: it classifies articles into two categories: fact and opinion. We show that the combination of (i) a classifier trained on a small (140K) training set of editorials/reports and (ii) an interactive user interface that ameliorates classification errors by re-ordering the presentation can be effective in highlighting different underlying viewpoints in a story-cluster. We briefly discuss the classifier used here, the training set and the UI and report on some initial anecdotal user feedback and evaluation.