A dual-layer CRFs based joint decoding method for cascaded segmentation and labeling tasks

  • Authors:
  • Yanxin Shi;Mengqiu Wang

  • Affiliations:
  • Language Technologies Institute, School of Computer Science, Carnegie Mellon University;Language Technologies Institute, School of Computer Science, Carnegie Mellon University

  • Venue:
  • IJCAI'07 Proceedings of the 20th International Joint Conference on Artificial Intelligence
  • Year:
  • 2007

Abstract

Many problems in NLP require solving a cascade of subtasks. Traditional pipeline approaches suffer from error propagation and prohibit joint training/decoding between subtasks. Existing solutions to this problem do not guarantee that the hard constraints imposed by the subtasks are respected, and thus give rise to inconsistent results, especially in cases where a segmentation task precedes a labeling task. We present a method that performs joint decoding of separately trained Conditional Random Field (CRF) models while guarding against violations of hard constraints. Evaluated on Chinese word segmentation and part-of-speech (POS) tagging tasks, our proposed method achieves state-of-the-art performance on both the Penn Chinese Treebank and First SIGHAN Bakeoff datasets. On both segmentation and POS tagging tasks, the proposed method consistently improves over baseline methods that do not perform joint decoding.
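
The abstract does not spell out the decoding procedure, so the following is only an illustrative sketch of one common way to decode two separately trained models jointly: score each candidate segmentation with the first model, tag its words with the second, and keep the pair with the best combined log-score. It is not taken from the paper. The function joint_decode, the interpolation weight, the toy tagger, and all scores are hypothetical. Because tags are assigned to whole words of a candidate segmentation, a tag can never disagree with a word boundary, which is the kind of hard constraint the abstract refers to.

```python
from typing import Callable, List, Sequence, Tuple

Segmentation = List[str]  # a sentence split into words
Tagging = List[str]       # one POS tag per word


def joint_decode(
    candidates: Sequence[Tuple[Segmentation, float]],
    tag_best: Callable[[Segmentation], Tuple[Tagging, float]],
    weight: float = 0.5,
) -> Tuple[Segmentation, Tagging, float]:
    """Return the (words, tags, score) triple with the highest combined log-score.

    candidates: (segmentation, log-score) pairs from a segmentation model.
    tag_best:   returns the best tagging and its log-score for a segmentation.
    weight:     hypothetical interpolation weight between the two models.
    """
    best = None
    for words, seg_logp in candidates:
        tags, tag_logp = tag_best(words)
        # Hard constraint: exactly one tag per word of this segmentation.
        if len(tags) != len(words):
            raise ValueError("tagger must return one tag per word")
        score = weight * seg_logp + (1.0 - weight) * tag_logp
        if best is None or score > best[2]:
            best = (words, tags, score)
    if best is None:
        raise ValueError("no candidate segmentations given")
    return best


# Toy stand-in for a trained tagger (made-up tags and log-scores).
TOY_TAGS = {
    ("AB", "C"): (["NN", "VV"], -2.0),
    ("A", "BC"): (["NN", "NN"], -0.4),
}


def toy_tagger(words: Segmentation) -> Tuple[Tagging, float]:
    return TOY_TAGS[tuple(words)]


# Two candidate segmentations of the toy string "ABC" with made-up log-scores.
toy_candidates = [(["AB", "C"], -0.5), (["A", "BC"], -0.7)]
words, tags, score = joint_decode(toy_candidates, toy_tagger)
```

In this toy setup a pipeline would commit to the first segmentation because it has the higher segmentation score, whereas the combined score prefers the second, which the tagger finds much easier to label.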