Neural networks letter: Training the max-margin sequence model with the relaxed slack variables

  • Authors:
  • Lingfeng Niu; Jianmin Wu; Yong Shi

  • Affiliations:
  • Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, 100190, China; Yahoo! Research & Development (Beijing), Tsinghua Science Park, Beijing, 100084, China; Research Center on Fictitious Economy & Data Science, Chinese Academy of Sciences, Beijing, 100190, China and College of Information Science and Technology, University of Nebraska at Omaha, Omaha, ...

  • Venue:
  • Neural Networks
  • Year:
  • 2012


Abstract

Sequence models are widely used in applications such as natural language processing, information extraction and optical character recognition. In this paper, we propose a new approach to training the max-margin sequence model by relaxing its slack variables. With the canonical feature mapping, the relaxed problem is solved by training a multiclass Support Vector Machine (SVM). Compared with state-of-the-art solutions for sequence learning, the new method has the following advantages: first, the sequence training problem is transformed into a multiclass classification problem, which is more widely studied and already has quite a few off-the-shelf training packages; second, the new approach reduces the training complexity significantly while achieving prediction performance comparable to that of existing sequence models; third, when the size of the training data is limited, assigning different slack variables to different microlabel pairs lets the method use the discriminative information more frugally and produce a more reliable model; last but not least, by employing kernels in the intermediate multiclass SVM, nonlinear feature spaces can be explored easily. Experimental results on named entity recognition, information extraction and handwritten letter recognition with public datasets illustrate the efficiency and effectiveness of our method.
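The reduction the abstract describes — turning sequence training into a multiclass classification problem over microlabel pairs — can be sketched as follows. This is a minimal illustration, not the paper's method: the feature representation and the `sequence_to_multiclass` helper are assumptions standing in for the paper's canonical feature mapping, and the resulting examples would then be fed to any off-the-shelf multiclass SVM package.

```python
# Hedged sketch: each position in a labelled sequence becomes one
# multiclass example whose features combine the local observation with
# the previous microlabel, and whose target class is the current
# microlabel. A standard multiclass SVM can then be trained directly
# on these examples.

START = "<s>"  # illustrative sentinel microlabel for sequence start


def sequence_to_multiclass(tokens, labels):
    """Turn one labelled sequence into (features, class) examples.

    The feature dict here is a toy stand-in for the paper's canonical
    feature mapping; the key point is that the previous microlabel is
    folded into the input so that pairwise (transition) information is
    preserved in a flat multiclass problem.
    """
    examples = []
    prev = START
    for tok, lab in zip(tokens, labels):
        features = {"token": tok, "prev_label": prev}
        examples.append((features, lab))
        prev = lab
    return examples


# Toy named-entity sequence, mirroring the NER task in the experiments.
tokens = ["John", "lives", "in", "Paris"]
labels = ["PER", "O", "O", "LOC"]
data = sequence_to_multiclass(tokens, labels)
```

Each element of `data` is then a single training example for the intermediate multiclass SVM; kernels applied at that stage give the nonlinear feature spaces mentioned above.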