A stochastic finite-state word-segmentation algorithm for Chinese
Computational Linguistics
Similarity-based methods for word sense disambiguation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Automatic retrieval and clustering of similar words
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 2
Noun classification from predicate-argument structures
ACL '90 Proceedings of the 28th annual meeting on Association for Computational Linguistics
An iterative algorithm to build Chinese language models
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
A generative constituent-context model for improved grammar induction
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach
Computational Linguistics
Bootstrapping POS taggers using unlabelled data
CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
Combining segmenter and chunker for Chinese word segmentation
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
Chinese word segmentation as LMR tagging
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
A maximum entropy Chinese character-based parser
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Chinese segmentation and new word detection using conditional random fields
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Effective self-training for parsing
HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
A hybrid Markov/semi-Markov conditional random field for sequence segmentation
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Hi-index | 0.00 |
Generative language modeling and discriminative classification are two main techniques for Chinese word segmentation. Most previous methods have adopted one of the techniques. We present a hybrid model that combines the disambiguation power of language modeling and the ability of discriminative classifiers to deal with out-of-vocabulary words. We show that the combined model achieves 9% error reduction over the discriminative classifier alone.