A Semi-supervised Learning Method for Vietnamese Part-of-Speech Tagging

  • Authors:
  • Le Minh Nguyen;Bach Ngo Xuan;Cuong Nguyen Viet;Minh Pham Quang Nhat;Akira Shimazu

  • Venue:
  • KSE '10 Proceedings of the 2010 Second International Conference on Knowledge and Systems Engineering
  • Year:
  • 2010

Abstract

This paper presents a semi-supervised learning method for Vietnamese part-of-speech tagging. We use two strong tagging models, Conditional Random Fields (CRFs) and Guided Online-Learning models (GLs), as base learners. We then propose a semi-supervised tagging model for both CRFs and GLs. The main idea is to use a word-cluster model as an auxiliary source that enriches the feature space of the discriminative learning models in both training and decoding. Experimental results on Vietnamese Treebank (VTB) data show that the proposed method is effective. Our best model achieved an accuracy of 94.10% when tested on VTB and 92.60% on an independent test set.
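
The idea of enriching a discriminative tagger's feature space with word-cluster information can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm or the feature templates, so everything below is an assumption made for illustration: the bit-string cluster IDs (in the style of hierarchical clustering), the toy `CLUSTERS` table, the function name `token_features`, and the prefix lengths are all hypothetical. The same feature function would be applied at both training and decoding time.

```python
from typing import Dict, List, Sequence

# Hypothetical word -> cluster-ID table. In a semi-supervised setup such a
# table would be induced from a large unlabeled corpus; the IDs here are
# invented toy values, not taken from the paper.
CLUSTERS: Dict[str, str] = {
    "tôi": "0010",
    "học": "1101",
    "tiếng": "0111",
}


def token_features(
    sentence: List[str],
    i: int,
    clusters: Dict[str, str],
    prefix_lengths: Sequence[int] = (2, 4),
) -> Dict[str, str]:
    """Build a feature dict for the token at position i.

    Baseline lexical features are combined with cluster-derived features,
    so words unseen in the labeled data can still share features with
    seen words that fall in the same cluster.
    """
    word = sentence[i]
    feats = {
        "word": word.lower(),
        "is_capitalized": str(word[:1].isupper()),
        "prev_word": sentence[i - 1].lower() if i > 0 else "<BOS>",
        "next_word": sentence[i + 1].lower() if i < len(sentence) - 1 else "<EOS>",
    }

    # Cluster features (assumed design): the full cluster ID plus shorter
    # prefixes, which correspond to coarser groupings of words.
    cluster = clusters.get(word.lower())
    if cluster is None:
        feats["cluster"] = "<UNK>"
    else:
        feats["cluster"] = cluster
        for p in prefix_lengths:
            feats[f"cluster_prefix_{p}"] = cluster[:p]
    return feats


def sentence_features(sentence: List[str], clusters: Dict[str, str]) -> List[Dict[str, str]]:
    """Feature dicts for every token, in the list-of-dicts form that
    sequence labelers such as CRF toolkits commonly accept."""
    return [token_features(sentence, i, clusters) for i in range(len(sentence))]
```

For example, `sentence_features(["tôi", "học", "tiếng"], CLUSTERS)` returns one feature dict per token; a word missing from the table receives only the `<UNK>` cluster feature and otherwise falls back to the lexical features.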