Part-of-speech tagging for Chinese-English mixed texts with dynamic features

  • Authors: Jiayi Zhao, Xipeng Qiu, Shu Zhang, Feng Ji, Xuanjing Huang
  • Affiliations: Fudan University, Shanghai, China (Zhao, Qiu, Ji, Huang); Fujitsu Research and Development Center, Beijing, China (Zhang)
  • Venue: EMNLP-CoNLL '12: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
  • Year: 2012

Abstract

In modern Chinese articles and conversations, it is common to include a few English words, especially in emails and Internet literature. Analyzing Chinese-English mixed texts has therefore become an important and challenging topic. The underlying problem is how to assign part-of-speech (POS) tags to the embedded English words. Because no specially annotated corpus is available, most of these English words are tagged with the oversimplified type "foreign words". In this paper, we present a method that uses dynamic features to tag the POS of mixed texts. Experiments show that our method achieves higher performance than traditional sequence labeling methods. Our method also improves POS tagging performance on pure Chinese texts.