A comparison of discriminative classifiers for web news content extraction

  • Authors:
  • Alex Spengler; Antoine Bordes; Patrick Gallinari

  • Affiliations:
  • Université Paris, Paris, France; Université Paris, Paris, France; Université Paris, Paris, France

  • Venue:
  • RIAO '10 Adaptivity, Personalization and Fusion of Heterogeneous Information
  • Year:
  • 2010

Abstract

Until now, approaches to web content extraction have focused on random field models, largely neglecting large margin methods. Structured large margin methods, however, have recently shown great practical success. We compare, for the first time, greedy and structured support vector machines with conditional random fields on a real-world web news content extraction task, showing that large margin approaches are indeed competitive with random field models.