Modern Chinese articles and conversations often include a few English words, especially in emails and Internet literature, so analyzing Chinese-English mixed text has become an important and challenging problem. The underlying difficulty is how to assign part-of-speech (POS) tags to the embedded English words. Because specially annotated corpora are lacking, most English words are tagged with the oversimplified type "foreign words". In this paper, we present a method that uses dynamic features to POS-tag mixed texts. Experiments show that our method outperforms traditional sequence labeling methods. It also improves POS tagging performance on pure Chinese texts.