Chinese word segmentation as LMR tagging
SIGHAN '03 Proceedings of the second SIGHAN workshop on Chinese language processing - Volume 17
In this paper we present Chinese word segmentation algorithms based on so-called LMR tagging. Our LMR taggers are implemented with the Maximum Entropy Markov Model, and we use Transformation-Based Learning to combine the results of two LMR taggers that scan the input in opposite directions. Our system achieves F-scores of 95.9% and 91.6% on the Academia Sinica corpus and the Hong Kong City University corpus, respectively.
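To make the idea of segmentation as tagging concrete, the following is a minimal sketch of how a segmented sentence can be mapped to per-character position tags and back. It assumes a four-tag scheme (LL for the left boundary of a word, MM for a middle character, RR for the right boundary, LR for a single-character word); the abstract does not spell out the tag set, so treat these labels as an assumption. The function names `words_to_lmr` and `lmr_to_words` are hypothetical, and the sketch covers only the tag encoding, not the Maximum Entropy Markov Model taggers or the Transformation-Based Learning combination step.

```python
from typing import List


def words_to_lmr(words: List[str]) -> List[str]:
    """Map a segmented sentence to one position tag per character.

    Assumed tag set (not confirmed by the abstract):
    LL = left boundary, MM = middle, RR = right boundary,
    LR = single-character word.
    """
    tags: List[str] = []
    for word in words:
        if not word:
            continue  # ignore empty tokens
        if len(word) == 1:
            tags.append("LR")
        else:
            tags.append("LL")
            tags.extend(["MM"] * (len(word) - 2))
            tags.append("RR")
    return tags


def lmr_to_words(chars: str, tags: List[str]) -> List[str]:
    """Rebuild words from characters and their position tags.

    A word is closed after RR or LR, and a new word is forced at
    LL or LR so that inconsistent tagger output still yields a
    complete segmentation.
    """
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag in ("LL", "LR") and current:
            words.append(current)  # flush an unterminated word
            current = ""
        current += ch
        if tag in ("RR", "LR"):
            words.append(current)
            current = ""
    if current:
        words.append(current)  # trailing characters without a closing tag
    return words


if __name__ == "__main__":
    sentence = ["上海", "到", "本世纪"]
    tags = words_to_lmr(sentence)
    print(tags)
    print(lmr_to_words("".join(sentence), tags))
```

Under this encoding, a tagger that scans left to right and one that scans right to left each emit one tag per character, so their outputs can be compared position by position before being decoded into words.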