Hybrid methods for POS guessing of Chinese unknown words

  • Authors: Xiaofei Lu
  • Affiliations: The Ohio State University, Columbus, OH
  • Venue: ACLstudent '05 Proceedings of the ACL Student Research Workshop
  • Year: 2005

Abstract

This paper describes a hybrid model that combines a rule-based model with two statistical models for the task of POS guessing of Chinese unknown words. The rule-based model is sensitive to the type, length, and internal structure of unknown words, and the two statistical models utilize contextual information and the likelihood for a character to appear in a particular position of words of a particular length and POS category. By combining models that use different sources of information, the hybrid model achieves a precision of 89%, a significant improvement over the best result reported in previous studies, which was 69%.
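As a rough illustration of the second statistical model's information source, the following minimal sketch estimates how likely each character is to occupy a given position in words of a given length and POS category, then picks the tag that scores highest for an unknown word. It is not the paper's implementation: the function names (`train`, `score`, `guess_pos`), the add-alpha smoothing, the `vocab_size` constant, and the toy lexicon are all assumptions made for this example, and the rule-based and contextual components are omitted.

```python
import math
from collections import Counter, defaultdict


def train(lexicon):
    """Count characters by (POS, word length, position).

    lexicon: iterable of (word, pos) pairs (hypothetical training data).
    """
    counts = defaultdict(Counter)  # (pos, length, position) -> character counts
    totals = Counter()             # (pos, length, position) -> total characters seen
    for word, pos in lexicon:
        n = len(word)
        for i, ch in enumerate(word):
            counts[(pos, n, i)][ch] += 1
            totals[(pos, n, i)] += 1
    return counts, totals


def score(word, pos, counts, totals, alpha=1.0, vocab_size=5000):
    """Log-likelihood of the word's characters under one POS tag.

    Uses add-alpha smoothing; alpha and vocab_size are illustrative
    assumptions, not values taken from the paper.
    """
    n = len(word)
    logp = 0.0
    for i, ch in enumerate(word):
        key = (pos, n, i)
        logp += math.log((counts[key][ch] + alpha) / (totals[key] + alpha * vocab_size))
    return logp


def guess_pos(word, tags, counts, totals):
    """Return the candidate tag with the highest character-position score."""
    return max(tags, key=lambda tag: score(word, tag, counts, totals))


# Toy lexicon, invented for this example only.
toy_lexicon = [("老师", "NN"), ("学生", "NN"), ("医生", "NN"),
               ("跑步", "VV"), ("散步", "VV")]
counts, totals = train(toy_lexicon)
print(guess_pos("先生", ["NN", "VV"], counts, totals))
```

In a hybrid setup such as the one the abstract describes, a score of this kind would be only one of several signals, combined with the rule-based and contextual models rather than used alone.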