Chunking using conditional random fields in Korean texts

  • Authors:
  • Yong-Hun Lee;Mi-Young Kim;Jong-Hyeok Lee

  • Affiliations:
  • Div. of Electrical and Computer Engineering POSTECH and AITrc, Pohang, R. of Korea;Div. of Electrical and Computer Engineering POSTECH and AITrc, Pohang, R. of Korea;Div. of Electrical and Computer Engineering POSTECH and AITrc, Pohang, R. of Korea

  • Venue:
  • IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
  • Year:
  • 2005

Abstract

We present a method for chunking Korean texts using conditional random fields (CRFs), a recently introduced probabilistic model for labeling and segmenting sequence data. In agglutinative languages such as Korean and Japanese, rule-based chunking is predominantly used for its simplicity and efficiency. A hybrid of rule-based and machine learning methods has also been proposed to handle exceptions to the rules. In this paper, we show how CRFs can be applied to the task of chunking Korean texts. Experiments on the STEP 2000 dataset show that the proposed method significantly improves performance and outperforms previous systems.
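
To illustrate how chunking can be cast as a sequence labeling problem of the kind a CRF solves, the following minimal sketch converts chunk spans into one tag per token. It assumes the common BIO (begin/inside/outside) encoding; the abstract does not say which tag scheme the authors used, and the function name `spans_to_bio`, the placeholder tokens, and the chunk types `NP` and `VP` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical helper: the abstract does not specify the tag scheme;
# BIO encoding is assumed here purely for illustration.
def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, chunk_type) spans to one BIO tag per token."""
    tags = ["O"] * n_tokens  # tokens outside every chunk keep the "O" tag
    for start, end, chunk_type in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"invalid span: ({start}, {end})")
        tags[start] = f"B-{chunk_type}"      # first token of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{chunk_type}"      # remaining tokens of the chunk
    return tags

# Placeholder tokens standing in for the units of a Korean sentence.
tokens = ["w1", "w2", "w3", "w4", "w5"]
spans = [(0, 2, "NP"), (3, 5, "VP")]
tags = spans_to_bio(len(tokens), spans)
for token, tag in zip(tokens, tags):
    print(token, tag)
```

Under this encoding, a sequence model such as a CRF would be trained to predict the tag sequence from features of the tokens, and chunks would be recovered by reading each `B-` tag together with the `I-` tags that follow it.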