Scaling conditional random fields by one-against-the-other decomposition

  • Authors:
  • Hai Zhao; Chunyu Kit

  • Affiliations:
  • Department of Chinese, Translation and Linguistics, City University of Hong Kong, Kowloon, Hong Kong, China (both authors)

  • Venue:
  • Journal of Computer Science and Technology
  • Year:
  • 2008

Abstract

As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed-training and joint-decoding algorithm for CRF learning. Instead of training a single CRF model over all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS) formulated as a sequence labeling problem. Our evaluation shows that it reduces the computational cost of this task by 40-50% without any significant performance loss on various large-scale data sets.
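
The decomposition and joint decoding described in the abstract can be made concrete with a short sketch. The code below is illustrative only, not the authors' implementation: it shows the one-against-the-other relabeling used to train each binary sub-CRF, and a simple joint decoder that combines the sub-CRFs' per-position positive-class probabilities with a Viterbi search constrained to valid transitions of the BMES tag set commonly used for CWS. All function names are hypothetical, and the sub-CRF marginals are faked with toy numbers; in practice they would come from trained binary CRF models.

```python
import math

TAGS = ["B", "M", "E", "S"]  # BMES tag set commonly used for CWS

# Valid BMES transitions: B->M/E, M->M/E, E->B/S, S->B/S
VALID_NEXT = {
    "B": {"M", "E"},
    "M": {"M", "E"},
    "E": {"B", "S"},
    "S": {"B", "S"},
}


def binarize_labels(tag_seqs, tag):
    """One-against-the-other relabeling: the target tag vs. everything else.
    Each binary sub-CRF is trained on one such binarized corpus."""
    return [[("POS" if t == tag else "NEG") for t in seq] for seq in tag_seqs]


def joint_decode(marginals):
    """Combine per-tag positive-class probabilities from all sub-CRFs
    with a Viterbi search restricted to valid BMES transitions.

    marginals: dict tag -> list of P(position i is tagged `tag`),
               one list per sub-CRF, all of the same length n.
    Returns the highest-scoring valid tag sequence.
    """
    n = len(next(iter(marginals.values())))
    NEG_INF = float("-inf")
    # score[i][t]: best log-probability of a valid path ending in tag t at i
    score = [{t: NEG_INF for t in TAGS} for _ in range(n)]
    back = [{t: None for t in TAGS} for _ in range(n)]
    for t in ("B", "S"):  # a word can only start with B or S
        score[0][t] = math.log(marginals[t][0] + 1e-12)
    for i in range(1, n):
        for t in TAGS:
            emit = math.log(marginals[t][i] + 1e-12)
            for prev in TAGS:
                if t in VALID_NEXT[prev] and score[i - 1][prev] + emit > score[i][t]:
                    score[i][t] = score[i - 1][prev] + emit
                    back[i][t] = prev
    last = max(("E", "S"), key=lambda t: score[n - 1][t])  # a word ends with E or S
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))


# Toy usage: fake marginals as if produced by four binary sub-CRFs
# for a 3-character input (a 2-character word followed by a 1-character word).
fake = {
    "B": [0.80, 0.10, 0.10],
    "M": [0.10, 0.20, 0.10],
    "E": [0.05, 0.60, 0.20],
    "S": [0.05, 0.10, 0.60],
}
print(joint_decode(fake))  # -> ['B', 'E', 'S']
```

The hard transition constraints here merely stand in for the sequence consistency that a joint decoder must enforce once the tags are predicted by independent binary models; the paper's actual combination of sub-CRF probabilities may differ.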