Hybrid methods for POS guessing of Chinese unknown words

Authors:
Xiaofei Lu
Affiliations:
The Ohio State University, Columbus, OH
Venue:
ACLstudent '05 Proceedings of the ACL Student Research Workshop
Year:
2005

Abstract

This paper describes a hybrid model that combines a rule-based model with two statistical models for part-of-speech (POS) guessing of Chinese unknown words. The rule-based model is sensitive to the type, length, and internal structure of unknown words. The two statistical models use contextual information and the likelihood that a character appears in a particular position in words of a particular length and POS category. By combining models that draw on different sources of information, the hybrid model achieves a precision of 89%, a significant improvement over the best result reported in previous studies, which was 69%.
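
The character-position idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the model's formula, so the add-one smoothing, the sum of log-probabilities, the function names (`train_char_position_model`, `guess_pos`), the tag labels (`NN`, `VV`), and the toy lexicon are all assumptions made for illustration. The sketch counts how often each character occurs at each position in known words of a given length and POS, then picks the POS under which an unknown word's characters are most likely.

```python
import math
from collections import defaultdict


def train_char_position_model(lexicon):
    """Count characters by (POS tag, word length, position) over known words.

    `lexicon` is an iterable of (word, tag) pairs. Hypothetical helper,
    not taken from the paper.
    """
    # (tag, word_length, position) -> character -> count
    counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for word, tag in lexicon:
        for i, ch in enumerate(word):
            counts[(tag, len(word), i)][ch] += 1
            vocab.add(ch)
    tags = sorted({tag for _, tag in lexicon})
    return counts, vocab, tags


def guess_pos(word, model):
    """Return the tag whose position-specific character statistics best fit `word`.

    Assumption: characters are scored independently with add-one smoothing;
    the paper may combine the evidence differently.
    """
    counts, vocab, tags = model
    smoothing_size = len(vocab) + 1  # one extra slot for unseen characters
    best_tag, best_score = None, -math.inf
    for tag in tags:
        score = 0.0
        for i, ch in enumerate(word):
            slot = counts.get((tag, len(word), i), {})
            total = sum(slot.values())
            score += math.log((slot.get(ch, 0) + 1) / (total + smoothing_size))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


# Toy lexicon, invented for illustration only.
toy_lexicon = [
    ("老师", "NN"), ("教师", "NN"), ("医师", "NN"),
    ("学习", "VV"), ("练习", "VV"), ("复习", "VV"),
]
model = train_char_position_model(toy_lexicon)
print(guess_pos("厨师", model))
print(guess_pos("预习", model))
```

On this toy data the sketch favors `NN` for 厨师, because 师 occurs word-finally only in the noun entries, and `VV` for 预习 for the analogous reason. A hybrid system of the kind the abstract describes would combine such a score with rule-based and contextual evidence, which this sketch does not attempt.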