This paper presents an integrative lexical analysis mechanism that addresses the limitations of existing lexical analysis systems based on "pipelining". The mechanism extends the Maximum Matching and Second-Maximum Matching model: all candidate words, together with their part-of-speech (POS) tags, are included in a directed graph, and Dijkstra's algorithm is applied to find the minimum-cost path through that graph. With the integrative model, word segmentation, POS tagging, and unknown word recognition are accomplished simultaneously, which avoids conflicts among the lexical analysis tasks and yields high precision. In an open test, the system achieves a precision of 98.65% and a recall of 98.96%.
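The idea of searching a directed graph of candidate words for a minimum-cost path can be illustrated with a minimal sketch. This is not the paper's implementation. The toy lexicon, the cost values, the "x" tag for unknown characters, and the function name segment_and_tag are all hypothetical, and a real system would likely also score transitions between adjacent POS tags, which this sketch omits. Nodes are character positions, each edge is one (word, POS tag) candidate, and Dijkstra's algorithm selects the cheapest path from the start of the sentence to its end, so segmentation and tagging are decided together.

```python
import heapq

# Hypothetical toy lexicon: word -> list of (POS tag, cost).
# Costs are made-up non-negative values (lower means more plausible).
LEXICON = {
    "研究": [("v", 2.0), ("n", 2.5)],
    "研究生": [("n", 3.0)],
    "生命": [("n", 2.0)],
    "命": [("n", 4.0)],
    "起源": [("n", 2.0)],
}


def segment_and_tag(text, lexicon, unknown_cost=10.0, max_len=4):
    """Return (list of (word, tag), total cost) for the cheapest path."""
    n = len(text)
    # edges[i] holds (end position, word, tag, cost) for candidates starting at i.
    edges = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            word = text[i:j]
            for tag, cost in lexicon.get(word, []):
                edges[i].append((j, word, tag, cost))
        # Assumed fallback: an out-of-lexicon character becomes an expensive
        # one-character "unknown" word, so the end is always reachable.
        if text[i] not in lexicon:
            edges[i].append((i + 1, text[i], "x", unknown_cost))

    # Dijkstra's algorithm over positions 0..n with a binary heap.
    dist = [float("inf")] * (n + 1)
    back = [None] * (n + 1)  # back[j] = (previous position, word, tag)
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale heap entry
        if i == n:
            break  # reached the end of the sentence
        for j, word, tag, cost in edges[i]:
            nd = d + cost
            if nd < dist[j]:
                dist[j] = nd
                back[j] = (i, word, tag)
                heapq.heappush(heap, (nd, j))

    # Walk the back-pointers from the end to recover the chosen words and tags.
    path = []
    pos = n
    while pos > 0:
        prev, word, tag = back[pos]
        path.append((word, tag))
        pos = prev
    path.reverse()
    return path, dist[n]


best_path, total_cost = segment_and_tag("研究生命起源", LEXICON)
```

With these made-up costs, the path 研究/v 生命/n 起源/n (total cost 6.0) beats the competing path 研究生/n 命/n 起源/n (total cost 9.0), so the ambiguity between 研究 and 研究生 is resolved by the global path cost rather than by greedy longest matching.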