Joint training and decoding using virtual nodes for cascaded segmentation and tagging tasks

  • Authors:
  • Xian Qian; Qi Zhang; Yaqian Zhou; Xuanjing Huang; Lide Wu

  • Affiliations:
  • Fudan University, Shanghai, P.R. China (all authors)

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Abstract

Many sequence labeling tasks in NLP, such as Chinese POS tagging and named entity recognition, require solving a cascade of segmentation and tagging subtasks. Traditional pipeline approaches usually suffer from error propagation, while joint training and decoding in the cross-product state space can lead to too many parameters and high inference complexity. In this paper, we present a novel method that integrates the graph structures of the two subtasks into one using virtual nodes and performs joint training and decoding in the factorized state space. Experimental evaluations on the CoNLL 2000 shallow parsing data set and the Fourth SIGHAN Bakeoff CTB POS tagging data set demonstrate the superiority of our method over cross-product, pipeline, and candidate reranking approaches.
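
To make the parameter-growth concern concrete, the following rough sketch compares label-space sizes for a first-order chain model. It is an illustration only, not the paper's model: the function names, the assumption of dense first-order transition parameters, and the label counts (4 segmentation labels, 30 tags) are all hypothetical, and the "separate chains" count is a naive baseline that ignores whatever parameters the virtual-node construction adds.

```python
from typing import Tuple


def cross_product_sizes(num_seg_labels: int, num_tags: int) -> Tuple[int, int]:
    """States and dense first-order transition parameters when every
    (segmentation label, tag) pair is treated as a single joint label."""
    states = num_seg_labels * num_tags
    return states, states * states


def separate_chain_sizes(num_seg_labels: int, num_tags: int) -> Tuple[int, int]:
    """States and dense first-order transition parameters when each subtask
    keeps its own label set (naive baseline, assumed for illustration)."""
    states = num_seg_labels + num_tags
    transitions = num_seg_labels ** 2 + num_tags ** 2
    return states, transitions


# Hypothetical sizes: 4 segmentation labels and 30 tags.
print(cross_product_sizes(4, 30))   # (120, 14400)
print(separate_chain_sizes(4, 30))  # (34, 916)
```

Under these assumed sizes, the cross-product label set needs roughly sixteen times as many transition parameters as two separate label sets, which is the kind of gap that motivates working in a factorized state space.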