Using uneven margins SVM and perceptron for information extraction

  • Authors:
  • Yaoyong Li;Kalina Bontcheva;Hamish Cunningham

  • Affiliations:
  • The University of Sheffield, Sheffield, UK;The University of Sheffield, Sheffield, UK;The University of Sheffield, Sheffield, UK

  • Venue:
  • CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
  • Year:
  • 2005

Abstract

The classification problems derived from information extraction (IE) typically have imbalanced training sets. This is particularly true when learning from smaller datasets, which often contain a few positive training examples and many negative ones. This paper takes two popular IE learning algorithms -- SVM and Perceptron -- and demonstrates how introducing an uneven margins parameter improves their results on imbalanced IE training data. Our experiments show that the uneven margins are indeed helpful, especially when learning from few examples: essentially, the smaller the training set, the more beneficial the uneven margins can be. We also compare our systems to other state-of-the-art algorithms on several benchmark corpora for IE.
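The core idea behind the uneven margins variants can be illustrated with a minimal sketch of a perceptron with uneven margins: the update is triggered whenever an example falls inside a class-specific margin, and setting a larger margin for the (rare) positive class pushes the decision boundary away from it. This is a simplified illustration of the general technique, not the authors' exact algorithm or settings; the function name and default parameter values are assumptions.

```python
import numpy as np

def paum_train(X, y, tau_pos=1.0, tau_neg=0.0, eta=1.0, epochs=20):
    """Sketch of a perceptron with uneven margins.

    tau_pos / tau_neg are the margin parameters for positive and
    negative examples (y in {+1, -1}). Choosing tau_pos > tau_neg
    biases the learned boundary away from the minority positive
    class, which is the intuition exploited on imbalanced IE data.
    Names and defaults here are illustrative assumptions.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            tau = tau_pos if yi == 1 else tau_neg
            # Update on a margin violation, not merely a misclassification:
            # positives must clear tau_pos, negatives only tau_neg.
            if yi * (np.dot(w, xi) + b) <= tau:
                w += eta * yi * xi
                b += eta * yi
    return w, b

# Toy imbalanced data: 2 positives, 4 negatives.
X = np.array([[2.0, 2.0], [3.0, 3.0],
              [-1.0, -1.0], [-2.0, -1.0], [-1.0, -2.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1, -1, -1])
w, b = paum_train(X, y)
preds = np.sign(X @ w + b)
```

With tau_pos = tau_neg this reduces to an ordinary margin perceptron; the uneven setting only changes which side of the boundary demands the larger clearance.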