Word segmentation in MSR-NLP is an integral part of a sentence analyzer that includes basic segmentation, derivational morphology, named entity recognition, new word identification, word lattice pruning and parsing. The final segmentation is produced from the leaves of parse trees. The output can be customized to meet different segmentation standards by combining the values of a set of parameters. The system participated in four tracks of the segmentation bakeoff (PK-open, PK-closed, CTB-open and CTB-closed) and ranked #1, #2, #2 and #3, respectively, in those tracks. Analysis of the results shows that each component of the system contributed to the scores.
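As a rough illustration of how a segmentation might be read off the leaves of a parse tree and adjusted by parameters, consider the minimal sketch below. It is not the MSR-NLP implementation: the `Node` class, the node labels (`PLURAL`, `ORG` and so on), the example tree and the `keep_whole` parameter are all hypothetical, chosen only to show how one tree could yield outputs that follow different segmentation standards.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A parse-tree node (hypothetical): a leaf carries text, an internal node carries children."""
    label: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def surface(node: Node) -> str:
    """Concatenate the text of all leaves under a node."""
    if not node.children:
        return node.text
    return "".join(surface(child) for child in node.children)


def segment(node: Node, keep_whole: Set[str]) -> List[str]:
    """Read a segmentation off the tree.

    A node whose label is in keep_whole is emitted as a single word;
    any other internal node is expanded into its children.
    """
    if not node.children or node.label in keep_whole:
        return [surface(node)]
    words: List[str] = []
    for child in node.children:
        words.extend(segment(child, keep_whole))
    return words


# Hypothetical analysis of "朋友们喜欢北京大学" (labels are assumptions).
tree = Node("S", children=[
    Node("PLURAL", children=[Node("N", "朋友"), Node("SUFFIX", "们")]),
    Node("V", "喜欢"),
    Node("ORG", children=[Node("LOC", "北京"), Node("N", "大学")]),
])

# Two parameter settings standing in for two segmentation standards.
coarse = segment(tree, keep_whole={"PLURAL", "ORG"})
fine = segment(tree, keep_whole=set())

print(coarse)
print(fine)
```

In this sketch a single set-valued parameter stands in for the system's parameter set. The point is only that the standard-specific decisions are applied when the leaves are read out, so the analysis itself does not have to change from one standard to another.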